fix(normalize): fail closed on invalid ChatGPT mapping trees by GaosCode · Pull Request #329 · MemPalace/mempalace

GaosCode · 2026-04-09T04:44:34Z

Closes #330

What does this PR do?

Fix ChatGPT mapping normalization so imports fail closed instead of silently ingesting the wrong transcript or raw invalid JSON.

This PR:

stops walking children[0] and resolves the transcript from the active node path instead
uses current_node when present, and rejects ambiguous multi-branch trees without it
rejects invalid ChatGPT exports when mapping is malformed, current_node is invalid, the resolved path does not connect back to the detected root, or the path is not reachable from the root via children
prevents mine_convos() from silently passing invalid ChatGPT exports downstream as plain text
updates ingest reporting so skipped invalid ChatGPT exports are counted and labeled correctly
adds regression tests for regenerated branches, edited branches, invalid current_node, orphaned subtrees, unreachable hidden nodes, and ingest skip reporting
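
The checks listed above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `MappingError` and `resolve_active_path` are hypothetical names, and the sketch assumes each node in `mapping` is a dict with `parent` and `children` keys and that the root is the single node whose `parent` is `None`.

```python
class MappingError(ValueError):
    """Hypothetical error type for this sketch (not the PR's class)."""


def resolve_active_path(mapping, current_node=None):
    """Return node ids from root to the active leaf, or raise MappingError."""
    if not isinstance(mapping, dict) or not mapping:
        raise MappingError("mapping is missing or malformed")
    for node in mapping.values():
        if not isinstance(node, dict):
            raise MappingError("mapping node is not an object")

    # Assumption: the root is the only node without a parent.
    roots = [nid for nid, node in mapping.items() if node.get("parent") is None]
    if len(roots) != 1:
        raise MappingError("expected exactly one root")
    root = roots[0]

    # Collect ids reachable from the root by following children edges only.
    reachable, stack = set(), [root]
    while stack:
        nid = stack.pop()
        if nid in reachable or nid not in mapping:
            continue
        reachable.add(nid)
        stack.extend(mapping[nid].get("children") or [])

    if current_node is None:
        # Without current_node, accept only a tree with a single leaf.
        leaves = [nid for nid in reachable if not mapping[nid].get("children")]
        if len(leaves) != 1:
            raise MappingError("ambiguous: multiple branches and no current_node")
        current_node = leaves[0]
    elif current_node not in mapping:
        raise MappingError("current_node is not in mapping")

    # Walk parent pointers from the active node back to the root.
    path, seen, nid = [], set(), current_node
    while nid is not None:
        if nid in seen or nid not in mapping:
            raise MappingError("parent chain is broken or cyclic")
        seen.add(nid)
        path.append(nid)
        nid = mapping[nid].get("parent")
    path.reverse()

    # A node that names a parent but is not listed in its children is rejected.
    if any(nid not in reachable for nid in path):
        raise MappingError("path is not reachable from root via children")
    return path
```

Every failure raises instead of returning a fallback value, so a caller cannot mistake an invalid export for a valid but empty transcript.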

How to test

uv run pytest tests/test_normalize.py -q
uv run pytest tests/test_convo_miner.py -q -k "ambiguous or invalid or summary"
uv run ruff check .

Checklist

Tests pass (python -m pytest tests/ -v)
No hardcoded paths
Linter passes (ruff check .)

web3guru888

Solid fix — the old children[0] walk was a silent data corruption vector and this is the right way to handle it.

What I like:

Fail closed on invalid data is the correct design choice. The previous behavior (return None, let mine_convos() silently pass the raw JSON downstream as plain text) meant corrupted memories could enter the palace without any signal. Raising ChatGPTNormalizeError and counting skips in the summary output makes failures visible.
Path resolution from current_node back to root is the correct algorithm — it follows the active branch exactly as ChatGPT intended, rather than guessing via children[0]. The reachability check (path must be reachable from root via children edges) catches both orphan subtrees and nodes that only have parent pointers without being listed as children.
Exception hierarchy (ChatGPTNormalizeError → ChatGPTBranchAmbiguityError) is clean — callers can catch broadly or narrowly.
Test coverage is thorough: regenerated branches, edited branches, invalid current_node, orphans, unreachable hidden nodes, too-few-messages, and ingest reporting. The multi-turn edit branch test (test_chatgpt_mapping_uses_active_edit_branch_path) is a particularly good scenario.
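
To make the "catch broadly or narrowly" point concrete, here is a minimal sketch of how a caller might count skips. The two exception names come from this PR; everything else (`fake_normalize`, `ingest`, the `branches` key, and the summary keys) is hypothetical and only stands in for the real normalizer and `mine_convos()`.

```python
class ChatGPTNormalizeError(Exception):
    """Base error: the export cannot be normalized safely."""


class ChatGPTBranchAmbiguityError(ChatGPTNormalizeError):
    """Narrower error: several branches and no current_node to pick one."""


def fake_normalize(export):
    """Hypothetical stand-in for the real normalizer."""
    if not isinstance(export.get("mapping"), dict):
        raise ChatGPTNormalizeError("mapping is malformed")
    # "branches" is an invented key used only to trigger the narrow error.
    if export.get("current_node") is None and export.get("branches", 1) > 1:
        raise ChatGPTBranchAmbiguityError("multiple branches, no current_node")
    return f"transcript for {export['title']}"


def ingest(exports):
    """Normalize each export; count skips instead of passing raw data on."""
    summary = {"ingested": 0, "skipped_ambiguous": 0, "skipped_invalid": 0}
    transcripts = []
    for export in exports:
        try:
            transcripts.append(fake_normalize(export))
            summary["ingested"] += 1
        except ChatGPTBranchAmbiguityError:
            summary["skipped_ambiguous"] += 1  # narrow catch comes first
        except ChatGPTNormalizeError:
            summary["skipped_invalid"] += 1  # broad catch handles the rest
    return transcripts, summary
```

The subclass handler has to appear before the base-class handler; otherwise the broad `except` would swallow the ambiguity case and the two counters could not be told apart.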

Minor observation (not blocking):

_collect_chatgpt_reachable_ids gets called twice for multi-branch trees without current_node — once inside _collect_chatgpt_leaf_ids and once in the main _try_chatgpt_json before _build_chatgpt_path. Could cache the result, but for typical conversation sizes this is negligible.
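
One way to avoid the second traversal, sketched under assumptions: compute the reachable set once and pass it to the leaf helper as an argument. The function names below are hypothetical simplifications, not the PR's private helpers, and the `calls` counter exists only to show how often the walk runs.

```python
calls = {"reachable": 0}  # counter used only to show how often the walk runs


def collect_reachable_ids(mapping, root_id):
    """Ids reachable from root_id by following children edges."""
    calls["reachable"] += 1
    reachable, stack = set(), [root_id]
    while stack:
        nid = stack.pop()
        if nid in reachable or nid not in mapping:
            continue
        reachable.add(nid)
        stack.extend(mapping[nid].get("children") or [])
    return reachable


def collect_leaf_ids(mapping, reachable):
    """Leaves among already-computed reachable ids (no second traversal)."""
    return sorted(nid for nid in reachable if not mapping[nid].get("children"))


def find_single_leaf(mapping, root_id):
    """Return (leaf, reachable) or raise if the tree has several leaves."""
    reachable = collect_reachable_ids(mapping, root_id)  # computed once
    leaves = collect_leaf_ids(mapping, reachable)  # reuses the same set
    if len(leaves) != 1:
        raise ValueError("ambiguous tree: %d leaves" % len(leaves))
    return leaves[0], reachable
```

Returning `reachable` alongside the leaf lets a later path-building step reuse the same set for its reachability check.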

GaosCode · 2026-04-10T02:07:31Z

Rebased/merged latest main and resolved conflicts. Targeted tests still pass.

GaosCode · 2026-04-13T05:17:46Z

Hi @bensig, quick follow-up on this PR.

I updated it on top of the latest develop, resolved the merge conflicts, and re-ran the targeted tests locally. It should be ready for review now when you have time.

Thanks!

GaosCode · 2026-04-15T12:49:54Z

Hi @igorls, quick follow-up on this PR.

I’ve updated the branch, resolved the latest merge conflicts, and re-ran the targeted tests locally. Could you please take a look when you have time?

Thanks!

web3guru888 approved these changes Apr 9, 2026

GaosCode force-pushed the fix/chatgpt-mapping-active-branch branch from 83d209b to ad9e46f Compare April 11, 2026 06:56

GaosCode requested review from bensig and milla-jovovich as code owners April 11, 2026 06:56

bensig changed the base branch from main to develop April 11, 2026 22:22

bensig requested a review from igorls as a code owner April 11, 2026 22:22

GaosCode force-pushed the fix/chatgpt-mapping-active-branch branch from ad9e46f to acf0923 Compare April 13, 2026 02:58

igorls added the area/mining (File and conversation mining) and bug (Something isn't working) labels Apr 14, 2026

fix(normalize): fail closed on invalid ChatGPT mapping trees

309b1ca

GaosCode force-pushed the fix/chatgpt-mapping-active-branch branch from acf0923 to 309b1ca Compare April 15, 2026 12:40

GaosCode wants to merge 1 commit into MemPalace:develop from GaosCode:fix/chatgpt-mapping-active-branch

3 participants
