PPPoE WAN bug?

Post by **NL2009** » Thu Jan 07, 2010 7:36 am

Hi Eric

I am still experiencing this problem on 1.0.15.

This is a problem after a temporary power outage: the router does not automatically re-establish the PPPoE link, and I may not be at home to sort it out.

Please let me know if I can supply any log files that may help in isolating this bug.

Post by **pbix** » Fri Jan 22, 2010 7:41 pm

Hopefully the OP is still around, because I have some info that might help.

Several months ago I posted on how to access your DSL modem's web interface. The suggestion was to put an ifconfig command and several iptables commands in the uci_firewall script.

I think that this problem may be related to those commands. Once I removed them I think that PPPoE is coming up correctly after reboot.

So my question to the original poster, and anyone else having this issue, is whether you have those commands in your uci_firewall script. If so, remove them and report whether that improves things for you.

I am not sure yet how the two are related, but this is what I have noticed.

Post by **NL2009** » Sat Jan 23, 2010 7:50 am

Hi

I have never used the uci_firewall script but am also experiencing this bug: my router *never* automatically re-establishes a PPPoE link after power up.

I have checked the "uci_firewall.sh" file on the router and it looks unmodified, i.e.

Code:

#!/bin/sh 
# Copyright (C) 2008 John Crispin <blogic@openwrt.org>

. /etc/functions.sh

IPTABLES="echo iptables"
IPTABLES=iptables

config_clear
include /lib/network
scan_interfaces

CONFIG_APPEND=1
config_load firewall

config fw_zones
ZONE_LIST=$CONFIG_SECTION

CUSTOM_CHAINS=1
DEF_INPUT=DROP
DEF_OUTPUT=DROP
DEF_FORWARD=DROP

load_policy() {
	config_get input $1 input
...etc...
}

Post by **pbix** » Sat Jan 23, 2010 1:22 pm

Well so much for that theory.

Could you enable your pppd log file and see if you get the same message as I am getting? The log file name can be found in /etc/ppp/options. Just change it to /tmp/ppplog

Then reboot and see what it says.
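For anyone following along, the edit described above can be sketched as a small shell helper. This is only an illustration: `set_ppp_logfile` is a hypothetical name, and it assumes the options file uses pppd's standard `logfile` option on a line of its own.

```shell
#!/bin/sh
# Sketch only: set_ppp_logfile is a hypothetical helper, not part of
# Gargoyle or OpenWrt. It points pppd's "logfile" option at a new path.
set_ppp_logfile() {
	options_file="$1"   # e.g. /etc/ppp/options
	log_path="$2"       # e.g. /tmp/ppplog

	if grep -q '^logfile ' "$options_file"; then
		# A logfile line already exists: rewrite it via a temp file.
		sed "s|^logfile .*|logfile $log_path|" "$options_file" > "$options_file.tmp" &&
			mv "$options_file.tmp" "$options_file"
	else
		# No logfile line yet: append one.
		echo "logfile $log_path" >> "$options_file"
	fi
}
```

On the router this would be called as `set_ppp_logfile /etc/ppp/options /tmp/ppplog`, followed by a reboot. Editing the file by hand does the same thing.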

Post by **Eric** » Tue Jan 26, 2010 9:50 am

In the next few days I'm going to make a release where I bump the version to the latest OpenWrt.

Has anyone tried the latest OpenWrt (8.09.2) and found whether this is working? Maybe this has been fixed upstream...

Post by **pbix** » Tue Feb 02, 2010 8:16 am

Eric,
Today I built Gargoyle out of the trunk and loaded it into my router. My router is now running Kamikaze 8.09.02.

This problem unfortunately remains. Sometimes pppd starts correctly and sometimes not.

Thinking this to be an OpenWRT issue, I opened a thread over there: https://forum.openwrt.org/viewtopic.php?id=23255

There do not seem to be many people having this issue with OpenWRT itself. I guess I should load OpenWRT on my router and see if this issue remains.

I saw an older OpenWRT ticket which seems related
https://dev.openwrt.org/ticket/2781

So in short it remains an issue.

One other thing I have learned is that I can restore the connection by running "ifup wan" from the command line.
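As a stopgap, that manual step could be wrapped in a check that only runs "ifup wan" when the PPP interface is missing. This is a rough sketch, not a fix for the underlying bug: both function names are hypothetical, and it assumes the link shows up as ppp0 in /proc/net/dev when it is up.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of Gargoyle or OpenWrt.

# iface_present NAME [FILE]: succeed if NAME is listed in FILE
# (defaults to /proc/net/dev, which lists one interface per line).
iface_present() {
	grep -q "^ *$1:" "${2:-/proc/net/dev}"
}

# restore_wan_if_down [NAME]: run "ifup wan" if the PPP interface
# (ppp0 by default, an assumption) does not exist.
restore_wan_if_down() {
	if ! iface_present "${1:-ppp0}"; then
		ifup wan
	fi
}
```

Calling `restore_wan_if_down` some time after boot would mimic what I do by hand.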

Post by **Eric** » Wed Feb 03, 2010 2:00 am

It would be really helpful if you could install/configure 8.09.2 and see if you still see the problem. I suspect you will (since I've barely touched PPPoE configuration code), but it would be good to confirm.

Post by **pbix** » Fri Feb 05, 2010 9:13 am

I did some testing with OpenWRT 8.09.02.

I noticed a different behaviour there. The following is the OpenWRT PPP log file.

Code:

Plugin rp-pppoe.so loaded.
Timeout waiting for PADO packets
Unable to complete PPPoE Discovery
Timeout waiting for PADO packets
Unable to complete PPPoE Discovery
Timeout waiting for PADO packets
Unable to complete PPPoE Discovery
PPP session is 3133
Using interface ppp0
Connect: ppp0 <--> eth0.1
Couldn't increase MTU to 1500
Couldn't increase MRU to 1500
PAP authentication succeeded
PAP authentication succeeded
peer from calling number 00:90:1A:A0:BF:3B authorized

It is interesting that OpenWRT retries the connection at 30 second intervals until it succeeds. I saw this same behavior once with Gargoyle, but most of the time there is only one try, and the Gargoyle log looks like this:

Code:

Plugin rp-pppoe.so loaded.
Timeout waiting for PADO packets
Unable to complete PPPoE Discovery
System time change detected.
Terminating on signal 15

I think it's interesting that the string "System time change detected." appears here.

This may be an important clue. How does the system time initialization of Gargoyle differ from that of OpenWRT? It may be that this initialization is interfering with the retry mechanism.
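To compare logs from different boots quickly, the two things that matter here (how many discovery attempts failed, and whether the time change message shows up) can be pulled out with grep. A minimal sketch, where `summarize_ppplog` is a made-up helper name and the log path is whatever was set in /etc/ppp/options:

```shell
#!/bin/sh
# Sketch only: summarize_ppplog is a hypothetical helper.
# It reports failed PPPoE discovery attempts and whether pppd
# logged a system time change.
summarize_ppplog() {
	log="$1"   # e.g. /tmp/ppplog

	# grep -c prints the number of matching lines (0 if none).
	attempts=$(grep -c 'Unable to complete PPPoE Discovery' "$log")
	echo "failed discovery attempts: $attempts"

	if grep -q 'System time change detected' "$log"; then
		echo "time change seen: yes"
	else
		echo "time change seen: no"
	fi
}
```

Run against the Gargoyle log above, this reports one failed attempt and a time change; against the OpenWRT log it reports three failed attempts and no time change.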

Post by **Eric** » Mon Feb 08, 2010 12:13 am

Ahhh... you may have found the problem (but not necessarily what the solution should be.)

The problem is that the bandwidth monitor and quotas absolutely rely on having the right date. If the date goes back to 1970, data is lost (because it thinks the time has massively changed). However, the initial date is set by ntpclient which gets the exact time from the network. If that's slow in coming up, all bandwidth monitor data would disappear unless another mechanism were in place.

So, in the ntpclient init script (/etc/init.d/ntpclient), I load a time backup file from /usr/data/time_backup, which gets written every 4 hours (note a cron job also gets set up here to do this). That way, the time is set to somewhere that is at least in the ballpark, which the bandwidth monitor can handle a lot better than if the time is at 1970. This may be interfering with PPPoE getting initialized.

Pbix, since you have a PPPoE connection to test this easily and now have commit access, can you look into this? Specifically: 1) see whether removing the automatic time update at the beginning of /etc/init.d/ntpclient causes PPPoE to work correctly, and 2) if this is the problem, see if you can find a clever solution that restores the time properly for the bandwidth monitor/quotas but doesn't kill PPPoE. One option may be to split the initial time restore out of ntpclient and run it before the network script (low START, STOP priority numbers).

Thanks!
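For reference, the save/restore idea boils down to something like the sketch below. It is not the actual ntpclient init script: the function names are made up, and the timestamp format (YYYY.MM.DD-hh:mm:ss in UTC, one of the forms BusyBox `date -s` accepts) is an assumption.

```shell
#!/bin/sh
# Sketch only: save_time and restore_time are hypothetical names, and the
# timestamp format is an assumption, not taken from the real init script.

# save_time [FILE]: write the current UTC time to the backup file.
# Meant to be called periodically (the real setup uses a cron job).
save_time() {
	date -u +%Y.%m.%d-%H:%M:%S > "${1:-/usr/data/time_backup}"
}

# restore_time [FILE]: set the clock from the backup file, if one exists.
# Needs root; errors are silenced.
restore_time() {
	backup="${1:-/usr/data/time_backup}"
	# Only touch the clock when the backup exists and is non-empty.
	[ -s "$backup" ] && date -u -s "$(cat "$backup")" >/dev/null 2>&1
}
```

Splitting the restore out would mean calling something like `restore_time` from its own init script with a START value lower than the network script's, leaving ntpclient to do only the NTP sync.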

Post by **pbix** » Mon Feb 08, 2010 9:15 am

Eric,
As a test I commented out

Code:

date -u -s $(cat /usr/data/time_backup) >/dev/null 2>&1

in ntpclient to see if that would change anything. I can report that with this line disabled, pppd retries just like OpenWRT and eventually connects. So this line seems related to the problem.

I am still thinking about the solution. I tried changing ntpclient to run just before network, thinking that if the time change occurred before pppd started to run, that would fix things. The result is that the original issue returns: one attempt followed by the "time change detected" message.

I also tried moving the time change fragment into set_kernal_timezone but still no luck.

So now I am pondering. I looked at the source code for pppd, and it seems to me that pppd itself does not care about time changes. The message is just informative and the code should work regardless.

It seems to me that the problem is in the program that calls pppd to start the link. That program stops retrying when this time change occurs. Can you outline how this retry mechanism works?
