I was able to solve the problem by booting into failsafe mode and deleting a log file. The problem appears to have been caused by running out of space. (See below for details on the solution.)
Does anyone have any good cron scripts to truncate log files and/or alert an admin when the router is running out of space?
I am including details below to help someone else (or myself when I have this problem again and forget what I did):
---------------------------------------
After a few months of use (5 months in my case) the router becomes unresponsive. Rebooting the router does not succeed: the router gets stuck in a boot loop.
More precisely, if you ping the router from a host while it boots, the router begins responding to pings after about 30 seconds, but stops responding about 30 seconds later. Likewise, SSH and even the web admin interface come up (if you time it just right), but only for a few seconds.
Booting the router into failsafe mode works. To enter failsafe:
- Unplug the WAN connection.
- Power-cycle the router.
- Wait for the Sys LED to begin blinking, then press the hardware button on the router.
- The Sys LED begins blinking faster to indicate that the router is in failsafe mode.
To connect to the router in failsafe mode:
- You cannot connect via SSH (PuTTY); you must use telnet.
- The IP address of the router will be 192.168.1.1.
- You must configure a static IP address on the 192.168.1.x subnet on your host to connect to the router.
Once connected, I deleted the log file:
cd /overlay/usr/data
rm webmon_domains.txt
Instead, I could have (but did not) reset the router to defaults, which erases all settings and reboots:
mtd -r erase rootfs_data
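To see what is actually using the space before deleting files or resetting, something like the following could be run from the telnet session. This is only a minimal sketch: `largest_entries` is a hypothetical helper name, and it assumes the BusyBox `du`, `sort`, and `tail` applets are available and that the overlay is reachable at /overlay, as it was in my case.

```shell
#!/bin/sh
# largest_entries DIR N: list the N largest files/directories under DIR.
# Sizes are in KiB, largest last. (Hypothetical helper name.)
largest_entries() {
    du -ak "$1" 2>/dev/null | sort -n | tail -n "$2"
}

# Example: show overall usage, then the 10 largest entries in the overlay.
# Skipped when /overlay does not exist.
if [ -d /overlay ]; then
    df -k /overlay
    largest_entries /overlay 10
fi
```

The directories themselves show up in the list with their totals, so the last line is normally the directory you passed in; the large file is a line or two above it.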
Note also that the router is set to download lists of hosts to block at startup (in rc.local), as suggested in https://forum.openwrt.org/viewtopic.php?id=35023. This works very well, but it requires about 1.3 MB of space at startup, which could contribute to the problem.
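For the cron question, a minimal sketch of the kind of check I have in mind might look like this. It is not a finished solution. The function names, the 90% threshold, and the 500-line limit are all made up for illustration. It also assumes that the webmon file is line-based and can safely be trimmed while the router is running, which I have not confirmed.

```shell
#!/bin/sh
# Sketch of a cron-driven overlay watchdog. All function names are hypothetical.

# usage_pct PATH: print the use percentage (digits only) of the
# filesystem that holds PATH.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# trim_file FILE N: keep only the last N lines of FILE.
# The temporary copy goes to /tmp (RAM on OpenWrt), so this still works
# when the flash is nearly full.
trim_file() {
    [ -f "$1" ] || return 0
    tmp="/tmp/trim.$$"
    tail -n "$2" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# check_overlay MOUNT LIMIT FILE N: if MOUNT is at or above LIMIT percent,
# write a warning to the system log and trim FILE to its last N lines.
check_overlay() {
    pct=$(usage_pct "$1")
    [ -n "$pct" ] || return 1
    if [ "$pct" -ge "$2" ]; then
        logger -t overlay-watch "$1 is ${pct}% full, trimming $3" 2>/dev/null || true
        trim_file "$3" "$4"
    fi
}

# Run only when called with four arguments, e.g. from cron:
#   overlay-watch.sh /overlay 90 /usr/data/webmon_domains.txt 500
if [ "$#" -eq 4 ]; then
    check_overlay "$@"
fi
```

Saved as, say, /usr/sbin/overlay-watch.sh (a hypothetical path), it could be run every 30 minutes with a line like `*/30 * * * * /usr/sbin/overlay-watch.sh /overlay 90 /usr/data/webmon_domains.txt 500` in /etc/crontabs/root. The warning only goes to the system log, so alerting an admin by other means would still need to be added.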