WRT54GL impossible AA fw & Backfire problems.
Posted: Fri Mar 15, 2013 7:09 pm
Hello forum members.
1st time posting here, so I'll briefly introduce myself.
IT guy & uber nerd by trade, been in the business.. approaching 20 yrs now. I have been happily setting up routers /w OpenWRT for going on 5yrs now and I LOVE IT!! Stock FW is *barf*.
I'll say upfront, I know a fair amount about Linux, but it's not my strong point, only from a lack of dedicating myself to it more.
So I'm by no means a Linux expert and I'm not a programmer, but I can manipulate existing code and hack together some things when I have to.
So with that said, I'll get down to business.
I'll say, this post is a bit long, but it's not a book or anything. I'm really detailing 2 separate, but related things with some background at the start.
I have a WRT54GL v1.1 router that was running WhiteRussian for about 2/2.5yrs. Quite happily I might add. Until, one day, it just wouldn't communicate on the WAN port anymore. The rest of it was fine, SSH BusyBox worked, don't recall if the X-WRT GUI did, pretty sure it was fine. I backed it all up, then wiped it back to factory Linksys FW. Worked just fine. Wiped NVRAM from term. before the switch.
Tried CoovaAP 2.x beta.. actually more like alpha, HAHA. Big FAIL!! Slow, nearly unusable and in general I became extremely frustrated with the lack of community support, documentation and a major glaring bug preventing a functional Captive Portal. So I scrapped it.
Did a TFTP of the latest Gargoyle v1.5.7 Attitude Adjustment.
WOW!! Another HUGE FAIL!!!
Took it about 3min to boot to the point it'd respond to pings, intermittently, the WAN light to stop blinking and the SES light to come on. Then it'd take at least another 2-3 minutes before I could even load the GUI. It wasn't quite this bad @ 1st login, but as soon as I set the root PW and rebooted, it went down the tubes. Quite often it wouldn't even boot all the way, sometimes no ping, sometimes ping, but no GUI. SSH I didn't consistently try. When I did get into the GUI, it was an absolute CRAWL!! Unusable.
Even tried 30-30-30 reset, no go, then back to factory FW which boots in about 20sec. Then TFTPd in AA ver. once more and still, same result.
It's possible the router is bonkers, I haven't ruled that out yet. I'd need to test it on factory FW, or find a functioning WR .trx/.bin image, as X-WRT went bye-bye... at least as of now there are no functioning builds at their new GoogleCode home. I suppose I could restore the image backups I made, but something tells me they'll still have a dead WAN iface.
I'd test Garg out on my other, identical router, but it's my production router. I don't want to do that. Bad, bad!! LOL
BTW, on WhiteRussian it booted in less than 20sec, at least as fast as factory, if not actually faster.
So I TFTPd the latest stable release from the Backfire builds.
This was muuuuch better. Still takes about 1.5min before I get GUI access, but that's not the end of the universe or anything, so long as it works right. GUI's fast, internet seems faster than WR, but no formal testing, just "feel". Using less ram than AA & WR IIRC.
Here's where the problems come in. I downloaded, using BT, the latest Quantal-Quetzal 12.10 i386 LUBUNTU release. No other torrents were running, it took about 15min to download on my 12Mb DSL conn. Total active torrent time, about 20min. I shut down uTorrent then went over to the Win7 lappy that I was going to be tossing it on. WiFi was now disconnected: it sees the network SSID, just fails to connect, quite quickly I might add. Tried my Android phone too, no go.
So I did an IF UP/DOWN for wifi0 using the command "wifi" from the shell. This solved it.
Well, later on that same day, after doing another torrent (different distro, no other torrents running), I found that the internet connection went down about 90% in, modem still up. BTW, it's in PPPoE bridged mode, the only way to fly.
So the modem is just a TA "Terminal Adapter".
Couldn't DNS resolve, let alone ping out to WAN/Inet. After about a minute or so, it came back online. This whole time the router wasn't responding to pings and I couldn't access it even when it was. I don't believe it rebooted, as a reboot takes much longer than that. But I didn't go into the wire closet to check it.
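A quick way to settle the "did it reboot or not" question without a trip to the wire closet is to read the uptime over SSH right after the router comes back. This is only a rough sketch: the helper names uptime_secs and rebooted_within are made up for illustration, the 300 second threshold is arbitrary, and it assumes the BusyBox build on the router includes cut.

```shell
#!/bin/sh
# Hypothetical helpers, not part of OpenWRT or Gargoyle.

# uptime_secs [file]: print whole seconds since boot.
# Reads a /proc/uptime-style file; the first field is seconds since boot.
uptime_secs() {
    cut -d. -f1 "${1:-/proc/uptime}"
}

# rebooted_within N [file]: succeed if the box has been up for less than N seconds.
rebooted_within() {
    [ "$(uptime_secs "$2")" -lt "$1" ]
}

# Example use on the router, right after an outage:
#   rebooted_within 300 && echo "it rebooted" || echo "up the whole time"
```

If the uptime is still counted in hours after the outage, the box hung and recovered rather than rebooted.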
I have known about the stability bug in the Broadcom open source b43 driver running on the bcm47xx (kernel 2.6) platform for a while, wherein it causes a lock-up due to a ring-buffer overflow. A fix has been slow to progress, as the bug has been rather difficult to track down. Well, it looks like much progress has been made. At least it has been partially fixed as of the beginning of this week, and the fix has been sent upstream for the latest AA & BF OpenWRT builds.
They still have some memory DMA management to build in, but the main issue is fixed, at the expense of about 512K memory increase. Which is a fair trade off for the time being until the memory issues are resolved further down the road.
Obviously this wasn't an issue in WR because it used the older 2.4.x kernel /w the proprietary Broadcom driver.
Here are the relevant links.
Upstream b43 Driver Patch
Bug Ticket Tracker Thread - Status: Closed/Fixed
Rather Extensive Discussion With Gory Details on Code, Debugging, etc..
Do understand that I'm testing this in the exact same environment, on the exact same connection /w the exact same equipment as I have been running my WR router @ the shop for years. Wherein I have had ZERO issues with lock-ups, stability, performance bottlenecks, etc... I can totally hose my connection for hours on end /w torrents and it runs just peachy fine. VPNs are great, RDP, LMI, TOR, whatever. It all works fine on WR on this network.
I really want to like Garg, I want to stay with it, this being my 1st foray into Garg. Another massively glaring problem is it behaves like a pesky Netgear box.... *shudders*. In that every time you change anything, even sneeze at it, it has to reboot itself. Something as simple as changing the encryption key for the WiFi interface and it has to reboot. Meanwhile I can SSH into it, edit '/etc/config/wireless' and change it there, very swiftly and easily /w VI, save, then issue the 'wifi' command to bring down the iface and bring it back up, taking all of 30sec, tops, and all without a full system reboot. It just seems SILLY! for Garg to reboot it every time you change even something minor. Is this a problem /w Backfire? I have yet to run a vanilla build of BF on a router. All my WR routers have been working so beautifully that I have just let them be.
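For anyone who wants to try the no-reboot route, here is a minimal sketch of the same key change done with uci instead of hand-editing the file in VI. Assumptions: the WiFi interface is the first wifi-iface section (index 0), the passphrase is a placeholder, and the RUN variable is just a dry-run switch I added for illustration, so by default the commands are only printed. I have not checked how the Garg GUI reacts to changes made behind its back.

```shell
#!/bin/sh
# Sketch: change the WiFi key without a full reboot.
# RUN defaults to "echo", so the commands are only printed (dry run).
# On the router, call it as: RUN="" sh thisscript.sh   to really execute them.
RUN="${RUN-echo}"

set_wifi_key() {
    # $1 = new key; wifi-iface index 0 is an assumption, check /etc/config/wireless
    $RUN uci set "wireless.@wifi-iface[0].key=$1"
    $RUN uci commit wireless
    # 'wifi' reloads the wireless config: iface goes down, then comes back up
    $RUN wifi
}

set_wifi_key "example-passphrase"
```

Run as-is it just prints the three commands, which is a handy sanity check before pointing it at a live router.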
To recap, my questions are as follows.
1. Can the upstream fix for the open source b43 driver be brought in from the aforementioned link? Is this something that can be done without re-compiling the whole she-bang? It's a whoppin' 1 line change of a 2 digit number to a 3 digit number. I.e., can't I just edit it into the existing build? I couldn't find it anywhere in the JFFS, so I suspect not, unless I'm missing its location. Which is highly possible.
2. Do you have any 2.6.x builds /w the proprietary BRCM driver? Apparently that's possible, according to an entry in this FAQ. Reference URL linked 2nd.
2.6 Kernel /w Proprietary Broadcom Drivers
Thread Referencing Source of Edit
3. Why won't AA run on this router? It uses about as much memory as WR, which is more than BF. CPU usage is much lower than WR, as far as I can tell. I know it's an old, slow, pokey router by today's standards, but it works fantastically for some pretty high load environments. So long as one is using OpenWRT WR.
Sorry for the long post, but I felt all, or most of that information was necessary to make an informed assessment and thus a direction to go in. Whichever of #1/#2 will fix this glaring BRCM issue would be super!!
Thanks everyone.