Release prep v0.7.2: relay_node tracking (PR #45) + CRC_BAD packet drop (closes #34) #46
Merged
Both the text-message and NodeInfo TX paths were ignoring the configured hop_limit and always sending with hop_limit=3/hop_start=3. Packets now carry the value from TransmitConfig, falling back to the constant if no config is present.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
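The fallback described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in tx_service.py: the `TransmitConfig` dataclass here is a hypothetical stand-in that models only `hop_limit`, `resolve_hop_limit` is an invented helper name, and the constant name follows the rename made later in this series.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_HOP_LIMIT = 3  # fallback value; name borrowed from the later rename


@dataclass
class TransmitConfig:
    """Hypothetical stand-in: only the hop_limit field is modelled."""
    hop_limit: Optional[int] = None


def resolve_hop_limit(config: Optional[TransmitConfig]) -> int:
    """Return the configured hop limit, or the default when none is set."""
    if config is None or config.hop_limit is None:
        return DEFAULT_HOP_LIMIT
    return config.hop_limit


print(resolve_hop_limit(None))                         # no config present
print(resolve_hop_limit(TransmitConfig(hop_limit=5)))  # configured value wins
```

Routing both TX paths through one helper of this shape would keep the text-message and NodeInfo fallbacks from drifting apart again.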
Surfaces the Meshtastic header relay_node byte (last byte of the last relay node's ID) through the full stack so the dashboard can show which node retransmitted each packet.

- meshtastic_decoder: parse header byte 15 as relay_node
- models/packet: add relay_node field and include in to_dict()
- storage/database: add relay_node column + migration
- storage/packet_repository: insert and restore relay_node
- simple_packet_feed: show relay as "!src ↝ !relay" in source cell; resolve full short-ID from node registry when known
- node_map: add drawFocusLine / clearFocusLine for persistent lines
- app.js: wire packet-row focus to map line draw/clear; keep node registry in sync with 15-second refresh

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
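As a rough illustration of the decoder step, the sketch below reads header byte 15 and renders the source cell. The function names are hypothetical, returning 0 for a too-short frame is an assumption made here, and the fallback that shows only the raw relay byte (when the registry cannot resolve a full short-ID) is likewise illustrative.

```python
HEADER_LEN = 16         # length of the Meshtastic radio header in bytes
RELAY_NODE_OFFSET = 15  # last header byte, per the commit message


def parse_relay_node(raw: bytes) -> int:
    """Return header byte 15; 0 means the packet arrived directly.

    Assumption: frames shorter than the header are treated as direct.
    """
    if len(raw) < HEADER_LEN:
        return 0
    return raw[RELAY_NODE_OFFSET]


def format_source(src_id: int, relay: int) -> str:
    """Render the source cell as "!src" or "!src ↝ !relay"."""
    if relay == 0:
        return f"!{src_id:08x}"
    return f"!{src_id:08x} ↝ !{relay:02x}"


# Byte 14 is set to 0xAA to show that it does not bleed into relay_node.
header = bytes(14) + b"\xaa\x5c"
print(format_source(0x4D8B18A1, parse_relay_node(header)))
```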
Bundled additions on top of the relay_node tracking work in PR #45:

- tests/test_relay_node_header.py: locks the Meshtastic header byte 15 read into Packet.relay_node. Covers the static parser in isolation (extracts nonzero byte, treats zero as direct, full-byte-range round-trip, no bleed from byte 14) and the full decode() path (Packet.relay_node populated, defaults to 0, included in to_dict() for the WebSocket payload).
- tests/test_database_migration.py: adds TestRelayNodeMigration with four cases for the packets.relay_node ALTER TABLE migration: fresh-install schema already includes the column, pre-PR-45 databases get it added on next connect(), the migration is idempotent across restarts (no duplicate column, no data loss), and rows that predated the column read back with relay_node=0 by default. Module docstring updated to cover both migrations.
- frontend/css/dashboard.css: .relay-hop now uses var(--accent-amber) instead of var(--accent-yellow, #f0c040). The dashboard does not define an --accent-yellow token, so the hex fallback would have fired everywhere; --accent-amber (#f59e0b) is the canonical token already used by the position packet type, mid-band RSSI, and other amber accents on the page.
- src/transmit/tx_service.py: renamed NODEINFO_HOP_LIMIT to DEFAULT_HOP_LIMIT and used it in both fallback expressions, so the text-message fallback no longer relies on a bare literal 3 while the NodeInfo fallback uses a named constant. The constant has no importers outside this file. Behavior unchanged (still 3).

All 280 tests pass. ruff clean. No edge-runtime behavior changes.
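The migration properties listed above (column added on connect, idempotent across restarts, old rows reading back as 0) can be illustrated with a small sqlite3 sketch. The helper name and the exact column definition are assumptions for this example, not the code in storage/database.

```python
import sqlite3


def ensure_relay_node_column(conn: sqlite3.Connection) -> bool:
    """Add packets.relay_node if it is missing; return True when added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(packets)")}
    if "relay_node" in columns:
        return False  # already migrated, nothing to do
    conn.execute(
        "ALTER TABLE packets ADD COLUMN relay_node INTEGER NOT NULL DEFAULT 0"
    )
    conn.commit()
    return True


# Simulate a database created before the column existed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packets (id INTEGER PRIMARY KEY, source TEXT)")
conn.execute("INSERT INTO packets (source) VALUES ('4d8b18a1')")

first_run = ensure_relay_node_column(conn)   # adds the column
second_run = ensure_relay_node_column(conn)  # no-op on restart
old_row_value = conn.execute("SELECT relay_node FROM packets").fetchone()[0]
```

Checking the schema before altering it is what makes the step safe to run on every connect(); the DEFAULT clause is what lets rows written before the migration read back as 0.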
SX1302Wrapper.receive() was logging the WARNING for STAT_CRC_BAD packets but still appending them to the returned packet list. The crc_ok=False field on ConcentratorPacket has never been read by any downstream code (concentrator_source, packet_router, decoders), so RF-corrupted bytes flowed all the way into MeshtasticDecoder.decode().

Three observable downstream symptoms produced by this:

1. Phantom node IDs in the local SQLite node table and the cloud DynamoDB node catalog. A bit flip in the source-ID field creates a "new node" that shares all-but-one bit with a real source ID. Visible in production logs as clusters like 4d8b18a1 / 4d8b98a1 / 7d8b18a9 / 7d8b98a9 (single-bit flips of one or two real IDs).
2. False ENCRYPTED packets. A corrupted channel-hash byte stops matching 0x08 (LongFast) and the packet is filed under "encrypted on a private channel we lack the key for." Real-world fingerprint: uniform-random distribution of channel hashes across hex space, inconsistent with actual private-channel popularity.
3. Garbled-but-readable text. AES-CTR XORs corrupted ciphertext with the keystream and produces mostly-correct plaintext with a few mangled characters. Reported as issue #34 ("3 hopr frOO\"Nc >> Mesa" instead of "3 hops from Costa Mesa").

Confirmed via private-repo git log that this predates v0.7.0: the pre-v0.7.0 source did not even define STAT_CRC_BAD, so no filtering was possible. v0.7.0 added the WARNING + crc_bad_count counter (RX diagnostic logging from the deferred core-module bundle) which made the symptoms newly visible. The append-anyway behavior has been live since the initial RAK2287 wiring commit.

Fix is one line: continue after the CRC_BAD log. The diagnostic WARNING and crc_bad_count counter both keep working unchanged. STAT_CRC_OK and STAT_NO_CRC paths are deliberately untouched (NO_CRC packets pass through as before; if it turns out they should also be dropped, that is a follow-up decision with its own evidence).

Tests: tests/test_sx1302_wrapper.py covers the filter contract (CRC_BAD dropped, CRC_OK passed, NO_CRC still passed, mixed batch returns only good packets, counter persists across calls, size==0 fast-path unchanged, not-started returns empty, lgw_receive error returns empty, request size matches LGW_PKT_MAX). 9 new tests, all passing alongside the existing 269 in the suite. Ruff clean.

Closes #34.
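The filter contract can be sketched in isolation as below. This is a simplified model rather than SX1302Wrapper itself: `RawPacket` and `CrcFilter` are hypothetical names, and the status values are assumed for illustration and should be checked against the HAL header in use.

```python
import logging
from dataclasses import dataclass
from typing import Iterable, List

logger = logging.getLogger(__name__)

# Status codes assumed for this sketch; verify against the concentrator HAL.
STAT_NO_CRC = 0x01
STAT_CRC_OK = 0x10
STAT_CRC_BAD = 0x11


@dataclass
class RawPacket:
    """Hypothetical stand-in for one packet returned by the concentrator."""
    status: int
    payload: bytes


class CrcFilter:
    """Counts and logs CRC_BAD packets, then drops them from the batch."""

    def __init__(self) -> None:
        self.crc_bad_count = 0

    def filter(self, batch: Iterable[RawPacket]) -> List[RawPacket]:
        good: List[RawPacket] = []
        for pkt in batch:
            if pkt.status == STAT_CRC_BAD:
                self.crc_bad_count += 1
                logger.warning("CRC_BAD packet (%d bytes)", len(pkt.payload))
                continue  # the one-line fix: never append corrupted bytes
            good.append(pkt)  # CRC_OK and NO_CRC both pass through
        return good


crc_filter = CrcFilter()
batch = [
    RawPacket(STAT_CRC_OK, b"ok"),
    RawPacket(STAT_CRC_BAD, b"corrupt"),
    RawPacket(STAT_NO_CRC, b"nocrc"),
]
kept = crc_filter.filter(batch)
```

Keeping the counter and the log call ahead of the `continue` is what preserves the diagnostics while stopping the bytes from reaching the decoder.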
- src/version.py: 0.7.1 -> 0.7.2
- config/default.yaml: device.firmware_version 0.7.1 -> 0.7.2
- README.md: version badge bump
- docs/CHANGELOG.md: v0.7.2 entry covering CRC_BAD drop fix (closes #34), relay_node tracking, and the hop_limit TransmitConfig fix

Hardware-validated on RAK V2 .141:

- 14 historical phantom IDs in local DB match the bit-flip fingerprint of legitimate neighbors, all (NO NAME), all pre-fix
- Zero phantoms registered since checkout (12/12 named arrivals over 2.5h)
- One CRC_BAD WARNING fired during soak (sf7 bw125 rssi=-102 snr=-12.5), no decoder follow-on; confirms the new continue statement engages on real RF traffic

Tests: 297 passed, ruff clean.
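One way to check a node table for the bit-flip fingerprint mentioned above is to compare each unnamed ID against known-good neighbors by Hamming distance. The sketch below is a heuristic under that assumption; the helper names and the one-bit threshold are illustrative and not part of the project. The example IDs are taken from the production-log cluster quoted earlier.

```python
from typing import Iterable, List


def bit_distance(id_a: str, id_b: str) -> int:
    """Number of differing bits between two hex node IDs."""
    return bin(int(id_a, 16) ^ int(id_b, 16)).count("1")


def likely_phantoms(
    candidates: Iterable[str], known: Iterable[str], max_bits: int = 1
) -> List[str]:
    """Return candidates within max_bits of a known ID (but not equal to one)."""
    known_ids = set(known)
    return [
        cand
        for cand in candidates
        if cand not in known_ids
        and any(0 < bit_distance(cand, real) <= max_bits for real in known_ids)
    ]


print(bit_distance("4d8b18a1", "4d8b98a1"))
print(likely_phantoms(["4d8b98a1", "deadbeef"], ["4d8b18a1"]))
```

Two genuine IDs can sit one bit apart by chance, so a match is a reason to inspect the entry (for example, whether it ever reported a name), not proof that it is a phantom.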
Summary
Prep branch for the v0.7.2 release. Bundles two changes that have been ready to ship since May 1:
fix/crc-bad-drop branch, never opened as a PR)

Version bump and CHANGELOG entry are deliberately not in this branch yet; they will land as a final commit once hardware validation on the test RAK is green. That keeps the fleet-wide update indicator (__version__ on main) from firing until we've confirmed the bundle runs cleanly on real silicon.

What's in this branch
Co-authored-by: trailers preserved per the May 4 attribution policy ("contributors choose their own attribution").

1. Relay-node tracking (originally #45)
Surfaces the Meshtastic header relay_node byte (the lowest byte of the last relay node's ID) through the decoder → Packet model → SQLite schema (with ALTER TABLE migration for existing installs) → WebSocket payload → dashboard packet feed. Clicking a row in the packet feed now draws a line on the map between the source and the relay, and the source cell shows !src ↝ !relay with full short-ID resolution when the relay is in the local node registry.

Real-world utility: kendel used this to trace a hop chain back to a rooftop node and discovered its ERP was bad.

Bonus fix in the same series: text-message and NodeInfo TX paths now read hop_limit from TransmitConfig instead of hardcoding 3 in two places. Behavior unchanged for anyone running the default (still 3); fixes the silent "I set hop_limit in my yaml and it didn't take" bug.

2. CRC_BAD packets dropped at the HAL boundary
SX1302Wrapper.receive() was logging the warning for STAT_CRC_BAD packets but still appending them to the returned packet list. The crc_ok=False field on ConcentratorPacket has never been read by any downstream code (concentrator_source, packet_router, decoders), so RF-corrupted bytes flowed all the way into MeshtasticDecoder.decode().

Three observable downstream symptoms produced by this:
"3 hopr frOO\"Nc >> Mesa" symptom in Parsing Error on Some messages #34.

This bug predates v0.7.0 (the pre-source-publication HAL didn't even define STAT_CRC_BAD), but v0.7.0's RX diagnostic logging made it newly visible. The append-anyway behavior has been live since the initial RAK2287 wiring commit.

Fix is one line: continue after the CRC_BAD log. The diagnostic WARNING and crc_bad_count counter both keep working unchanged. STAT_CRC_OK and STAT_NO_CRC paths are deliberately untouched.

Closes #34.
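To see why a corrupted ciphertext still decrypts to mostly-readable text, as in the #34 symptom, recall that CTR mode XORs the data with a keystream, so a flipped ciphertext bit flips exactly the same plaintext bit and nothing else. The sketch below demonstrates that property with a SHA-256-based keystream as a stand-in; it is not AES and not the project's decryption code, and the key is a made-up example.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Stand-in keystream (NOT AES): hash of key plus a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing data with the keystream."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = b"example-key"  # hypothetical key for the demo
plaintext = b"3 hops from Costa Mesa"

ciphertext = bytearray(xor_stream(key, plaintext))
ciphertext[5] ^= 0x01  # simulate a single RF bit error in byte 5
recovered = xor_stream(key, bytes(ciphertext))

print(recovered)  # only the character at index 5 changes: "s" becomes "r"
```

Because the damage stays local, nothing in the decrypted text itself signals the corruption; the radio CRC is the only integrity check on this path, which is why the drop has to happen before decoding.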
Cloud-side evidence (informational)
A diagnostic audit on meshradar.io this morning found the cloud active_nodes_24h = 18,374 count is dominated by phantom IDs that match the CRC_BAD-passthrough fingerprint:
long_name AND no position (decryption fails on corrupted ciphertext → ENCRYPTED with no decoded payload)

After this fix lands and the fleet rolls forward, expect active_nodes_24h to drop sharply (rough estimate: from 18k toward 5–8k) over the following 24–48h as phantom IDs age out of the 24h window. We'll watch the trend post-deploy and decide whether to run a one-shot purge of pre-fix phantoms in DynamoDB or just let the 30-day NODE_TTL clean them up naturally.

Test plan
- pytest tests/ → 297 passed, 1 pre-existing warning, 2.60s
- ruff check . → all checks passed
- git checkout release/v0.7.2, sudo systemctl restart meshpoint, watch logs for clean startup
- MESHPOINT_DEBUG_RX=1 for 5 minutes; confirm STAT_CRC_BAD lines no longer carry an associated phantom-node DB write
- relay_node populates correctly in the dashboard packet feed
- meshpoint logs for any new tracebacks
- Release v0.7.2 commit bumping __version__, default.yaml firmware_version, README badge, and CHANGELOG. Push, watch CI, squash-merge.

Authoring + attribution
Kendel's two commits on this branch (1627b3f and 2c77d91) carry his GitHub noreply email so his profile links correctly in the contributors graph + git blame, identical to the PR #38 squash on main. The Co-Authored-By: Claude Sonnet 4.6 trailers from his original commits are preserved.

PR #45 was accidentally closed by the contributor on May 4 with a misdirected force-push that also wiped the work from his fork. Local pr-45-kendelmccarley retained everything, and these commits are cherry-picked from there. A linkback comment will be posted on the closed PR pointing here so the original contribution thread stays connected.