Help with testing the AP+Client Bug


DoesItMatter
Moderator
Posts: 1373
Joined: Thu May 21, 2009 3:56 pm

Re: Help with testing the AP+Client Bug

Post by DoesItMatter »

I had another USB Wireless G adapter - Airlink 101 XR Mimo

It uses the RaLink chipset, not the RTL chipsets

Plugged it in, used default Windows update drivers.

Still using Gargoyle Atheros 1.0.2 hotfix

Able to connect to the Fonera no problem

Another wireless card I have that's able to connect to the Fon is an old Netgear WG311 V2.

From what I found searching online, it looks like the Netgear WG311 cards have an Atheros chipset.

-----------------------

So, with this hotfix on Gargoyle 1.0.2, I can connect with wireless cards that use a RaLink chipset or an Atheros chipset.

I am so far having trouble with the RTL8185 chipset - Realtek

Anyone else with a Realtek card having connection issues?

Again - NOT a problem for me - just a potential bug or incompatibility with the older driver set in this hotfix.
:twisted: Soylent Green Is People! :twisted:
2x Asus RT-N16 = Asus 3.0.0.4.374.43 Merlin
2x Buffalo WZR-HP-G300NH V1 A0D0 = Gargoyle 1.9.x / LEDE 17.01.x
2x Engenius - ESR900 Stock 1.4.0 / OpenWRT Trunk 49400

crisman
Posts: 25
Joined: Sat May 09, 2009 3:15 pm

Re: Help with testing the AP+Client Bug

Post by crisman »

Eric wrote: The changes I applied are in the svn... but like I said this is a horrible ugly hack. It isn't that I applied a patch to revert previous patches, as I might do for a minor issue, it's that I reverted the entire madwifi driver, hostapd and wpa_supplicant (since the code for the latter two is influenced by the first) back to r13000 of the openwrt trunk.

Some of those patches may be useful... but it's very clear that in aggregate they break the system horribly (at least in ap+sta mode).

While I could spend the next month or so figuring out what the actual root cause of this is, I'm not sure that's the best use of my time. I can generally figure out code quickly but the madwifi driver code base is huge! Also, it is possible it will get fixed by the openwrt developers (who have a history of working with madwifi, and probably can find the root cause easier than I can).

Therefore, as people seem to be agreeing that this "solution" works, I'm inclined to leave in the ugly hack. My plan is to put the issue on the back-burner for now, unless it's clear madwifi is still causing problems.
OK, that sounds good.
I have an idea (though I don't know if it is really a good one):
why don't we try to find the revision that introduced the problem with a binary search?
We have a lower bound (r13000 seems to work without issues) and we can use the latest revision (or any other known-bad one) as the upper bound. So in about log2(rlast - 13000) steps we can find the revision that introduced the problem. That should translate to 12-13 test builds.

What do you think?
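
As a rough check of that estimate, here is a minimal shell sketch that counts how many test builds a binary search needs for a given revision range. It assumes a single breaking revision and a test that reliably tells good from bad. The function name bisect_steps is made up for illustration, and r16000 is only an assumed example of an upper bound.

```shell
#!/bin/sh
# Count the test builds a binary search needs to isolate the first bad
# revision, given a known-good and a known-bad revision number.
# bisect_steps is a hypothetical helper name, not part of any tool.
bisect_steps() {
    good=$1
    bad=$2
    span=$((bad - good))   # number of candidate revisions
    steps=0
    # Each test build halves the remaining candidates (rounding up).
    while [ "$span" -gt 1 ]; do
        span=$(( (span + 1) / 2 ))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# r13000 is the known-good revision from this thread; r16000 is an
# assumed upper bound, used only as an example.
bisect_steps 13000 16000
```

For that assumed range the script prints 12, which is in line with the 12-13 builds estimated above.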

Eric
Site Admin
Posts: 1443
Joined: Sat Jun 14, 2008 1:14 pm

Re: Help with testing the AP+Client Bug

Post by Eric »

I'd say you'll likely (I'd give it 90% probability) find that changeset 13708 is the one that breaks it, though I haven't run the test to confirm this.

The problem is that this changeset is enormous, including a patch that adds a new HAL. You are right, though, that a binary search is definitely the way to verify it. The question is what you do once you know the changeset and it has changes in several hundred files, not including the binary HAL.

crisman
Posts: 25
Joined: Sat May 09, 2009 3:15 pm

Re: Help with testing the AP+Client Bug

Post by crisman »

I tried r13707 and the problem is present. So the bug was introduced somewhere between r13001 and r13707.

I think we must identify the faulty changeset first. After that, if it contains a lot of patches, we can open a specific ticket to help the openwrt developers find the problem; if the changes are small, we can look for what's wrong ourselves.

EDIT:

Another idea is to try the madwifi from this svn: http://trac.fonosfera.org/fon-ng/browse ... on/madwifi

This is the svn for the firmware used in the fonera 2.0.
So we could use this madwifi, adding a little code to support the nanostation antenna, like:

Code: Select all

config_get_bool softled "$device" softled 1
devname="$(cat "/proc/sys/dev/$device/dev_name")"
antgpio=
# Pick the GPIO number used for antenna selection on NanoStation boards
case "$devname" in
    NanoStation2) antgpio=7;;
    NanoStation5) antgpio=1;;
esac
if [ -n "$antgpio" ]; then
    softled=0
    config_get antenna "$device" antenna
    # Map the configured antenna to diversity and rx/tx antenna values
    case "$antenna" in
        external)   antdiv=0; antrx=1; anttx=1 ;;
        horizontal) antdiv=0; antrx=1; anttx=1 ;;
        vertical)   antdiv=0; antrx=2; anttx=2 ;;
        auto)       antdiv=1; antrx=0; anttx=0 ;;
    esac
    # Skip the GPIO calls below if gpioctl is not installed
    [ -x "$(which gpioctl 2>/dev/null)" ] || antenna=
    case "$antenna" in
        horizontal|vertical|auto)
            gpioctl "dirout" "$antgpio" >/dev/null 2>&1
            gpioctl "set" "$antgpio" >/dev/null 2>&1
            ;;
        external)
            gpioctl "dirout" "$antgpio" >/dev/null 2>&1
            gpioctl "clear" "$antgpio" >/dev/null 2>&1
            ;;
    esac
fi

# Apply only the values that were set above
[ -n "$antdiv" ] && sysctl -w dev."$device".diversity="$antdiv" >&-
[ -n "$antrx" ] && sysctl -w dev."$device".rxantenna="$antrx" >&-
[ -n "$anttx" ] && sysctl -w dev."$device".txantenna="$anttx" >&-
[ -n "$softled" ] && sysctl -w dev."$device".softled="$softled" >&-

Eric
Site Admin
Posts: 1443
Joined: Sat Jun 14, 2008 1:14 pm

Re: Help with testing the AP+Client Bug

Post by Eric »

I beg to differ. I made a build based on madwifi in r13707 and it seems to work fine (current uptime with ap+client and wpa encryption: 3 hours and counting).

I still need to try 13708 (I'll do that tomorrow). The 13707 rootfs and lzma files are linked below -- let me know if they work for you as they seem to for me.

gargoyle_madwifi_13707.squashfs


gargoyle_madwifi_13707.lzma

DoesItMatter
Moderator
Posts: 1373
Joined: Thu May 21, 2009 3:56 pm

Re: Help with testing the AP+Client Bug

Post by DoesItMatter »

I left Fon Flash running and flashed the madwifi test version above before I left for work.

I'll try the AP + Client mode as soon as I get home tomorrow and report results.

I don't know if I'll have time to test the RTL8185 card, but I'll definitely test with the Airlink USB adapter I have - Ralink chipset.

---------------------------------

OK.

@ Home and loaded up this madwifi test version.

Running AP + Client mode. Been up about 30 mins, no disconnects, actually pretty good connection.

I'm running on this right now as I post this message, and also have a streaming movie playing over the internet.

Movie streams well, and the connection is keeping up, no drops/disconnects.

Fonera is connected via DHCP Wireless to a WEP router.

My computer is connected to the Fonera AP portion via Airlink 101 USB wireless G adapter with RaLink chipset and using WPA-PSK encryption.

It looks like this version (13707?) is working in Client + AP mode.

------------------

Looks like it's been up almost 8 hours, still connected and working!
If you throw up the 13708 build or another suspect build, I can load it up and see whether or not I have success with it.

Since I now have a working config, I can copy down all the info - wasn't much setup anyway - and try out the new builds to see if they work.

crisman
Posts: 25
Joined: Sat May 09, 2009 3:15 pm

Re: Help with testing the AP+Client Bug

Post by crisman »

crisman wrote: Another idea could be try using madwifi from this svn: http://trac.fonosfera.org/fon-ng/browse ... on/madwifi

This is the svn for the firmware used in the fonera 2.0.
Hello, I tried this solution, compiling madwifi, hostapd and wpa_supplicant from the above svn. The problem is that with this modified firmware the wifi0 device doesn't show up, so I cannot use wifi.

I think we need to modify some code, but I don't know what. Can you look at that? I think this svn is a good candidate because I have a fonera 2.0 and ap+client works on it.

Eric
Site Admin
Posts: 1443
Joined: Sat Jun 14, 2008 1:14 pm

Re: Help with testing the AP+Client Bug

Post by Eric »

I've been testing, and at this point I'm fairly sure that it's r15465 (not 13708) that breaks things.

There are a bunch of patches applied to that revision too, though, so it's still not clear-cut which patch/piece of code is responsible.

Also... I would like some help verifying that I've got the right revision, since, like I keep saying, the symptoms of this bug sometimes take a little while to pop up and I may have missed something.

Here are both the 15464 & 15465 firmware: If some of you could help verify that 15464 DOES work and that 15465 DOES NOT work, that would be very helpful.

15464:
gargoyle_1.0.2_m15464-atheros-root.squashfs

gargoyle_1.0.2_m15464-atheros-vmlinux.lzma

gargoyle_1.0.2_m15464-atheros-vmlinux.gz

15465:
gargoyle_1.0.2_m15465-atheros-root.squashfs

gargoyle_1.0.2_m15465-atheros-vmlinux.lzma

gargoyle_1.0.2_m15465-atheros-vmlinux.gz

DoesItMatter
Moderator
Posts: 1373
Joined: Thu May 21, 2009 3:56 pm

Re: Help with testing the AP+Client Bug

Post by DoesItMatter »

This is a cut/paste from the other announcement thread
-------------------------------------------------

Fonera 2201+

I made sure I did an "fis init -f"

I always fresh type in settings, no loading of old settings.

Loaded up the test 1.0.2 build 15464

Client + AP mode = Working fine here

I have the Airlink 101 Mimo USB wireless RaLink chipset

Client is hooked up via WEP 128bit

AP mode of Fonera is using WPA2-PSK

Running an online movie now and if it does 2 hours of uptime, I'll load up the 15465 and see if that works or doesn't work.

-------------------------

Edit: 15464 might NOT work...

It worked for about 25 minutes, then dropped connection and wouldn't re-connect.

13707 had a tested uptime of 8 hours +
I didn't test it any longer because I don't use AP + Client mode for my setup.

I'm freshly loading 15465 to see if there is any difference.

-------------------------

Fresh load of 15465 - same steps as above.

This time, only worked for 5 minutes then dropped connection and wouldn't re-connect.

So it looks like neither 15464 nor 15465 is the solution. Maybe it's a slightly earlier build, but one after 13707, which had 8+ hours of uptime.
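
Since these drops show up anywhere from 5 minutes to several hours in, a scripted check can make the uptime numbers easier to compare between builds. Below is a minimal sketch, assuming a Linux client with a standard ping command. The function name soak_test and the address 192.168.1.1 are hypothetical and not taken from this thread.

```shell
#!/bin/sh
# Run a probe command repeatedly and report how many consecutive checks
# passed before the first failure. soak_test is a hypothetical helper.
# Usage: soak_test MAX_CHECKS INTERVAL_SECONDS PROBE [ARGS...]
soak_test() {
    max=$1
    interval=$2
    shift 2
    passed=0
    while [ "$passed" -lt "$max" ]; do
        # Any non-zero exit status from the probe counts as a drop.
        if ! "$@" >/dev/null 2>&1; then
            echo "FAIL after $passed checks"
            return 1
        fi
        passed=$((passed + 1))
        sleep "$interval"
    done
    echo "OK $passed checks"
}

# Example (assumed address): probe the Fonera once a minute for 8 hours.
# soak_test 480 60 ping -c 1 192.168.1.1
```

A single lost ping is not the same as a dropped association, so a real test would probably want to require a few consecutive failures before calling a build bad.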

Eric
Site Admin
Posts: 1443
Joined: Sat Jun 14, 2008 1:14 pm

Re: Help with testing the AP+Client Bug

Post by Eric »

Ok, thanks. I guess 15464 doesn't work. Thanks for helping test -- this is very helpful, and this is exactly why I wanted you to check what I found. The intermittent nature of this bug is really, really annoying.

Let's go back to r13708. We know 13707 works... but I'm the only one who has tested 13708. I really put that revision through a bunch of tests (waited to see uptime of over 12 hours), because I was so suspicious of it, but I didn't see any problems.

So... if 13708 works and 15464 fails, that's GREAT news, because the changes in between are all really minor and if we find the right revision, it should be easy to correct. If 13708 fails though... we're in trouble. There are so many changes in madwifi (including a new HAL!) between 13707 and 13708 that it will be very difficult to track down the right one.

Anyway, here's the 13708 build. Let me know if it works (and cross your fingers that you find this one works!)

13708:
gargoyle_1.0.2_m13708-atheros-root.squashfs

gargoyle_1.0.2_m13708-atheros-vmlinux.lzma

gargoyle_1.0.2_m13708-atheros-vmlinux.gz
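
If 13708 does hold up, the search window becomes 13708 (good) to 15464 (bad). The bookkeeping for picking the next revision to build can be sketched as below. It assumes 13708 is confirmed good, which only one tester has seen so far, and the function name next_rev is hypothetical.

```shell
#!/bin/sh
# Print the next revision to build: the midpoint between the last
# known-good and the first known-bad revision. next_rev is a
# hypothetical helper name.
next_rev() {
    good=$1
    bad=$2
    # Adjacent revisions mean the search is finished.
    if [ $((bad - good)) -le 1 ]; then
        echo "done: r$bad is the first bad revision"
        return 0
    fi
    echo $(( good + (bad - good) / 2 ))
}

# Assumes 13708 is confirmed good and 15464 is confirmed bad.
next_rev 13708 15464
```

With those two bounds it prints 14586. Not every trunk revision touches madwifi, so in practice the nearest revision that actually changes the driver would be the one to build.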
