Summary
Staging ingestor receives ~0.5 messages/min from mqtt2.wcmesh.com, while mosquitto_sub against the SAME broker / SAME credentials / SAME topics from inside the SAME container receives ~112 messages/min: a gap of more than 200×.
Repro on staging (current state)
# Inside the staging container, exact same credentials + topic patterns staging subscribes to:
docker exec corescope-staging-go timeout 30 mosquitto_sub \
-h mqtt2.wcmesh.com -p 8883 --tls-version tlsv1.2 --insecure \
-u <REDACTED> -P <REDACTED> \
-t "meshcore/SJC/+/packets" -t "meshcore/OAK/+/packets" \
-t "meshcore/PRB/+/packets" -t "meshcore/SFO/+/packets" \
-t "meshcore/MRY/+/packets" 2>&1 | grep raw | wc -l
# → 56 packets in 30s
# Compare ingestor's cumulative stats (same broker, same creds):
docker logs corescope-staging-go 2>&1 | grep "\[stats\]" | tail -1
# → tx_inserted=20 tx_dupes=44 obs_inserted=64 over 1h35m
# → ~44 messages received total in 95 minutes
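The per-minute rates quoted in the summary follow directly from these two measurements. A minimal sketch of the arithmetic, taking the figures above at face value (56 packets in 30 s, ~44 messages in 1h35m); `per_min` and `gap_ratio` are hypothetical helper names, not part of the ingestor:

```shell
# per_min COUNT SECONDS: messages per minute (hypothetical helper).
per_min() {
  awk -v n="$1" -v s="$2" 'BEGIN { printf "%.2f\n", n * 60 / s }'
}

# gap_ratio COUNT_A SECS_A COUNT_B SECS_B: rate A divided by rate B.
gap_ratio() {
  awk -v a="$1" -v sa="$2" -v b="$3" -v sb="$4" \
    'BEGIN { printf "%.0f\n", (a / sa) / (b / sb) }'
}

echo "mosquitto_sub: $(per_min 56 30) msg/min"     # 56 packets in 30 s   -> 112.00
echo "ingestor:      $(per_min 44 5700) msg/min"   # ~44 msgs in 5700 s   -> 0.46
echo "gap:           $(gap_ratio 56 30 44 5700)x"  # -> 242
```

With the unrounded ingestor rate the gap comes out near 240×, the same order of magnitude as the figure in the summary.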
Suspects (in priority order)
- paho CleanSession=true drops persistent subscriptions on reconnect; reconnect happens every ~5min on staging
- Subscription QoS mismatch: broker only delivers QoS 0 to non-persistent sessions; paho subscribes QoS 1 but session is non-persistent → broker drops
- Subscribe ACK race: paho considers subscribe "complete" before the broker acks, so messages in flight before the ACK are silently dropped
- Network MTU / TLS-fragment issue on the larger payloads: small status messages get through, larger packet messages don't (would explain the ratio if packets are 800 bytes; observers may need MTU diagnosis)
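The first two suspects can be partly probed without touching the ingestor, by rerunning the repro with mosquitto_sub's QoS and session flags changed. The following is a sketch only: `count_raw`, `MQTT_USER`, `MQTT_PASS`, `DRY_RUN` and the client ID `staging-probe` are hypothetical names introduced here, while `-q`, `-c` (disable clean session) and `-i` (client ID, required by `-c`) are standard mosquitto_sub options. It does not reproduce paho's reconnect behaviour, so a matching count here narrows the suspects rather than clearing paho.

```shell
# count_raw [EXTRA_FLAGS...]: count "raw" lines seen in 30 s with the
# repro's subscription plus any extra mosquitto_sub flags.
# MQTT_USER / MQTT_PASS are assumed to hold the staging credentials.
# With DRY_RUN=1 the command is printed instead of executed.
count_raw() {
  set -- docker exec corescope-staging-go timeout 30 mosquitto_sub \
    -h mqtt2.wcmesh.com -p 8883 --tls-version tlsv1.2 --insecure \
    -u "$MQTT_USER" -P "$MQTT_PASS" \
    -t "meshcore/SJC/+/packets" -t "meshcore/OAK/+/packets" \
    -t "meshcore/PRB/+/packets" -t "meshcore/SFO/+/packets" \
    -t "meshcore/MRY/+/packets" "$@"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@" 2>&1 | grep -c raw || true   # grep -c still prints 0 on no match
  fi
}

# Suggested comparisons (not run here):
#   count_raw -q 0                        # baseline, same as the repro
#   count_raw -q 1                        # QoS 1 on a clean session
#   count_raw -q 1 -c -i staging-probe    # QoS 1 on a persistent session
```

If the `-q 1` run on a clean session drops well below the baseline, that points at the QoS suspect. Note that the `-c` run leaves a persistent session on the broker under the probe client ID until it is cleared.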
Out of scope
- Reorganizing the ingest pipeline architecture
- Switching MQTT libraries
- Investigating broker-side rate limits (probably not the cause given mosquitto_sub works)
Critical context
- Prod authenticates as the same user against the same broker and sees 21,733 pkts/hr (roughly 3 orders of magnitude more than staging). This suggests staging's paho config is the differentiator, not the broker or auth.
- Staging container has NORMAL network (mosquitto_sub gets 56 in 30s)
- Watchdog detects + reconnects but doesn't help → reconnect re-subscribes but messages between reconnect cycles are missed
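The ~5 min reconnect cadence can be checked against the container logs. A sketch, assuming the watchdog writes a log line containing "reconnect" (the exact wording is an assumption, so adjust the grep pattern); `reconnect_gaps` is a hypothetical helper, and `--timestamps` is the standard `docker logs` flag that prefixes each line with an RFC 3339 timestamp:

```shell
# reconnect_gaps: read lines that start with an RFC 3339 timestamp and
# print the gap in seconds between consecutive lines. Only the time of
# day is used, so gaps of a day or more are not handled.
reconnect_gaps() {
  awk '{
    split(substr($1, 12, 8), t, ":")     # HH:MM:SS part of the timestamp
    now = t[1] * 3600 + t[2] * 60 + t[3]
    if (NR > 1) {
      d = now - prev
      if (d < 0) d += 86400              # crossed midnight
      print d
    }
    prev = now
  }'
}

# Usage (not run here):
#   docker logs --timestamps corescope-staging-go 2>&1 \
#     | grep -i reconnect | reconnect_gaps
```

Gaps clustering around 300 would be consistent with the ~5 min cycle described above.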
Acceptance
- … mosquitto_sub uses (KeepAlive, CleanSession, AutoReconnect, MaxReconnectInterval, SubscriptionRetry)
- paho.SetOrderMatters(false) or equivalent for parallel processing if relevant
- paho.SetCleanSession(false) + persistent clientID, see if delivery rate matches mosquitto_sub