fix(normalize): fail closed on invalid ChatGPT mapping trees#329
fix(normalize): fail closed on invalid ChatGPT mapping trees#329GaosCode wants to merge 1 commit intoMemPalace:developfrom
Conversation
web3guru888
left a comment
There was a problem hiding this comment.
Solid fix — the old children[0] walk was a silent data corruption vector and this is the right way to handle it.
What I like:
-
Fail closed on invalid data is the correct design choice. The previous behavior (return
None, letmine_convos()silently pass the raw JSON downstream as plain text) meant corrupted memories could enter the palace without any signal. RaisingChatGPTNormalizeErrorand counting skips in the summary output makes failures visible. -
Path resolution from
current_nodeback to root is the correct algorithm — it follows the active branch exactly as ChatGPT intended, rather than guessing viachildren[0]. The reachability check (path must be reachable from root via children edges) catches both orphan subtrees and nodes that only have parent pointers without being listed as children. -
Exception hierarchy (
ChatGPTNormalizeError→ChatGPTBranchAmbiguityError) is clean — callers can catch broadly or narrowly. -
Test coverage is thorough: regenerated branches, edited branches, invalid
current_node, orphans, unreachable hidden nodes, too-few-messages, and ingest reporting. The multi-turn edit branch test (test_chatgpt_mapping_uses_active_edit_branch_path) is a particularly good scenario.
Minor observation (not blocking):
_collect_chatgpt_reachable_ids gets called twice for multi-branch trees without current_node — once inside _collect_chatgpt_leaf_ids and once in the main _try_chatgpt_json before _build_chatgpt_path. Could cache the result, but for typical conversation sizes this is negligible.
|
Rebased/merged latest main and resolved conflicts. Targeted tests still pass. |
83d209b to
ad9e46f
Compare
ad9e46f to
acf0923
Compare
|
Hi @bensig, quick follow-up on this PR. I updated it on top of the latest Thanks! |
acf0923 to
309b1ca
Compare
|
Hi @igorls, quick follow-up on this PR. I’ve updated the branch, resolved the latest merge conflicts, and re-ran the targeted tests locally. Could you please take a look when you have time? Thanks! |
Closes #330
What does this PR do?
Fix ChatGPT
mappingnormalization so imports fail closed instead of silently ingesting the wrong transcript or raw invalid JSON.This PR:
children[0]and resolves the transcript from the active node path insteadcurrent_nodewhen present, and rejects ambiguous multi-branch trees without itmappingis malformed,current_nodeis invalid, the resolved path does not connect back to the detected root, or the path is not reachable from the root viachildrenmine_convos()from silently passing invalid ChatGPT exports downstream as plain textcurrent_node, orphaned subtrees, unreachable hidden nodes, and ingest skip reportingHow to test
Checklist
Tests pass (python -m pytest tests/ -v)
No hardcoded paths
Linter passes (ruff check .)