Every Meshpoint runs a small standalone service called network-watchdog
alongside the main meshpoint service. Its only job is to notice when WiFi
silently dies and try to bring it back without you having to drive to the
device. This page explains what it does, when it acts, and how to tune it.
For the install/service plumbing see Install Script. For unrelated WiFi setup issues see Common Errors > WiFi and networking.
The watchdog runs as network-watchdog.service (separate from meshpoint.service
so a meshpoint crash never disables WiFi recovery). On startup it disables
WiFi power save on wlan0, then enters a loop:
- Every 120 seconds, ping the default gateway.
- If the gateway does not reply, fall back to pinging
8.8.8.8. - If both fail, count it as one consecutive failure.
- After 3 consecutive failures (about 6 minutes), restart the
wlan0interface (Stage 1 recovery). - Reboot escalation (Stage 2) is disabled by default as of v0.6.5. See Re-enabling auto-reboot below.
A successful ping at any point resets the counter to zero. The fallback
to 8.8.8.8 exists because some consumer routers and captive-portal
networks block ICMP to the gateway itself, which would otherwise cause
the watchdog to think WiFi was down when it was actually fine.
All thresholds live as constants at the top of
/opt/meshpoint/scripts/network_watchdog.py:
| Setting | Default | Meaning |
|---|---|---|
CHECK_INTERVAL_SECONDS |
120 |
Seconds between connectivity checks |
PING_TIMEOUT_SECONDS |
5 |
How long to wait for each ping reply |
RESTART_THRESHOLD |
3 |
Consecutive failures before Stage 1 (interface restart) |
REBOOT_THRESHOLD |
0 |
Consecutive failures before Stage 2 (system reboot). 0 disables it |
WIFI_INTERFACE |
wlan0 |
Interface to restart on Stage 1 |
FALLBACK_PING_TARGET |
8.8.8.8 |
Used when gateway does not reply |
With defaults, the watchdog will restart wlan0 after roughly 6 minutes
of no connectivity and never reboot the Pi on its own.
Earlier versions used REBOOT_THRESHOLD = 6, which would reboot the Pi
after 6 consecutive failures (about 12 minutes offline). On networks where
the gateway blocks ICMP, this caused infinite reboot loops: the watchdog
could not tell the difference between "WiFi is broken" and "router does
not answer pings", so it would reboot, come back up, fail to ping the
gateway, and reboot again forever.
v0.6.5 fixes both halves of that problem:
- The fallback to
8.8.8.8means the watchdog only counts a failure when the Pi truly cannot reach the internet, not just when one specific host refuses to reply. REBOOT_THRESHOLD = 0disables the reboot escalation entirely, so even if something exotic causes the fallback path to also fail, the Pi will not nuke itself.
Stage 1 (interface restart) is unchanged and still fires at 3 failures. That covers the actual common case: WiFi driver wedged, association lost, DHCP lease expired without renewal. Restarting the interface fixes those without a full reboot.
If you want the old behavior back (for example: headless deployments with no physical access where a reboot is the lesser evil):
sudo nano /opt/meshpoint/scripts/network_watchdog.pyChange:
REBOOT_THRESHOLD = 0to:
REBOOT_THRESHOLD = 6Then restart the service:
sudo systemctl restart network-watchdogThe startup log line will confirm the new policy:
INFO network-watchdog: Starting network watchdog (interface=wlan0, restart=3, reboot=6)
When REBOOT_THRESHOLD = 0, that field reads reboot=disabled.
Note that git pull will not overwrite your edit unless the file changes
upstream. If a future release modifies network_watchdog.py, git will
either merge cleanly or flag a conflict so you know to re-apply your
threshold.
Live tail:
sudo journalctl -u network-watchdog -fLast 50 lines:
sudo journalctl -u network-watchdog -n 50Confirm it is running:
sudo systemctl status network-watchdogYou should see Active: active (running). If it is inactive or
failed, restart it:
sudo systemctl restart network-watchdogTo see the per-ping detail (which target was pinged, gateway vs fallback),
temporarily switch the log level. Edit /opt/meshpoint/scripts/network_watchdog.py
and change level=logging.INFO to level=logging.DEBUG, then
sudo systemctl restart network-watchdog. Revert when you are done so the
journal does not grow with low-value entries.
If you are debugging WiFi yourself and do not want the watchdog restarting the interface from under you:
sudo systemctl stop network-watchdog
sudo systemctl disable network-watchdogRe-enable later with:
sudo systemctl enable --now network-watchdogThe main meshpoint service does not depend on the watchdog. Stopping
the watchdog does not affect packet capture, the dashboard, or anything
else.
The watchdog only cares about WiFi (wlan0). It does not monitor:
- Ethernet (
eth0): if you are wired in, the watchdog still runs and pings, but Stage 1 recovery restartswlan0, noteth0. On wired-only deployments the watchdog is effectively a no-op. - The cloud uplink to
meshradar.io: that is handled by the upstream WebSocket client inside the main meshpoint service, which has its own reconnect with exponential backoff. - The local dashboard: served by the main meshpoint service, not the watchdog.
If any of those break, look at the main service log (meshpoint logs),
not the watchdog log.