[ty] Narrow semantic node ID lookup tables#26363

Draft
charliermarsh wants to merge 1 commit into main from charlie/narrow-semantic-node-id-maps

Conversation

@charliermarsh charliermarsh commented Jun 25, 2026

Member

Summary

File-local semantic IDs are stored at their full width in several immutable lookup tables, even though almost every file uses fewer than 65,536 IDs.

This adds an independent NarrowNodeIndexMap. When every value fits, it stores each mapping as a packed six-byte entry containing a 32-bit node index and a 16-bit semantic ID, avoiding the padding of a (NodeKey, u16) tuple. Unusually large files fall back to the existing full-width FrozenMap representation. This PR does not depend on #26350.
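A minimal sketch of the packed layout described above, not the PR's actual implementation. The names `PackedEntry` and `NodeIdMap`, the sorted-slice lookup, and the use of a `HashMap` as a stand-in for the full-width `FrozenMap` fallback are all assumptions made for illustration. It shows why a byte-array entry occupies six bytes where a `(u32, u16)` tuple is padded to eight, and how a single oversized ID forces the wide representation.

```rust
use std::collections::HashMap;

/// Hypothetical packed entry: a 32-bit node index followed by a 16-bit ID.
/// Storing raw bytes gives the type alignment 1, so there is no padding.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PackedEntry([u8; 6]);

impl PackedEntry {
    fn new(node_index: u32, id: u16) -> Self {
        let n = node_index.to_le_bytes();
        let i = id.to_le_bytes();
        PackedEntry([n[0], n[1], n[2], n[3], i[0], i[1]])
    }

    fn node_index(self) -> u32 {
        u32::from_le_bytes([self.0[0], self.0[1], self.0[2], self.0[3]])
    }

    fn id(self) -> u16 {
        u16::from_le_bytes([self.0[4], self.0[5]])
    }
}

/// Hypothetical immutable lookup table with a narrow and a wide form.
enum NodeIdMap {
    /// Entries sorted by node index so lookups can binary search (an assumption).
    Narrow(Box<[PackedEntry]>),
    /// Stand-in for the existing full-width representation.
    Wide(HashMap<u32, u32>),
}

impl NodeIdMap {
    fn from_pairs(pairs: &[(u32, u32)]) -> Self {
        // Try to narrow every ID to 16 bits; a single failure yields `None`.
        let narrow: Option<Vec<PackedEntry>> = pairs
            .iter()
            .map(|&(node, id)| u16::try_from(id).ok().map(|id| PackedEntry::new(node, id)))
            .collect();

        match narrow {
            Some(mut entries) => {
                entries.sort_by_key(|e| e.node_index());
                NodeIdMap::Narrow(entries.into_boxed_slice())
            }
            None => NodeIdMap::Wide(pairs.iter().copied().collect()),
        }
    }

    fn get(&self, node_index: u32) -> Option<u32> {
        match self {
            NodeIdMap::Narrow(entries) => entries
                .binary_search_by_key(&node_index, |e| e.node_index())
                .ok()
                .map(|i| u32::from(entries[i].id())),
            NodeIdMap::Wide(map) => map.get(&node_index).copied(),
        }
    }

    fn is_narrow(&self) -> bool {
        matches!(self, NodeIdMap::Narrow(_))
    }
}
```

In this sketch, `std::mem::size_of::<PackedEntry>()` is 6 while `std::mem::size_of::<(u32, u16)>()` is 8, since the tuple inherits the four-byte alignment of `u32`.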

We use the narrow representation for AST use IDs and scope lookups. Scope lookups keep module, primary-scope, and type-parameter-scope entries separate because a node can anchor both a primary scope and a type-parameter scope.
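The partitioning of scope lookups can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `ScopeKey`, `ScopeLookup`, and `find` are hypothetical names, and plain sorted vectors stand in for the narrow tables. The point is that once keys are reduced to a bare node index, a primary scope and a type-parameter scope anchored at the same node would collide in a single table, so each kind gets its own.

```rust
/// Hypothetical scope key; the `u32` payload is the anchoring node's index.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ScopeKey {
    Module,
    Primary(u32),
    TypeParameters(u32),
}

/// Hypothetical lookup with one table per kind of scope key.
#[derive(Default)]
struct ScopeLookup {
    module: Option<u16>,
    primary: Vec<(u32, u16)>,
    type_parameters: Vec<(u32, u16)>,
}

impl ScopeLookup {
    fn build(entries: &[(ScopeKey, u16)]) -> Self {
        let mut lookup = ScopeLookup::default();
        for &(key, scope_id) in entries {
            match key {
                ScopeKey::Module => lookup.module = Some(scope_id),
                ScopeKey::Primary(node) => lookup.primary.push((node, scope_id)),
                ScopeKey::TypeParameters(node) => {
                    lookup.type_parameters.push((node, scope_id));
                }
            }
        }
        // Sort each table by node index so lookups can binary search.
        lookup.primary.sort_unstable_by_key(|&(node, _)| node);
        lookup.type_parameters.sort_unstable_by_key(|&(node, _)| node);
        lookup
    }

    fn get(&self, key: ScopeKey) -> Option<u16> {
        match key {
            ScopeKey::Module => self.module,
            ScopeKey::Primary(node) => find(&self.primary, node),
            ScopeKey::TypeParameters(node) => find(&self.type_parameters, node),
        }
    }
}

/// Binary search a table sorted by node index.
fn find(table: &[(u32, u16)], node_index: u32) -> Option<u16> {
    table
        .binary_search_by_key(&node_index, |&(node, _)| node)
        .ok()
        .map(|i| table[i].1)
}
```

With this layout, `ScopeKey::Primary(7)` and `ScopeKey::TypeParameters(7)` resolve to different scope IDs even though both carry node index 7.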

@charliermarsh charliermarsh added performance Potential performance improvement ty Multi-file analysis & type inference labels Jun 25, 2026
astral-sh-bot Bot commented Jun 25, 2026


Typing conformance results

No changes detected ✅

Current numbers
The percentage of diagnostics emitted that were expected errors held steady at 94.47%. The percentage of expected errors that received a diagnostic held steady at 89.19%. The number of fully passing files held steady at 95/134.

astral-sh-bot Bot commented Jun 25, 2026


Memory usage report

Summary

| Project | Old | New | Diff | Outcome |
| --- | --- | --- | --- | --- |
| flake8 | 29.02MB | 28.88MB | -0.49% (144.46kB) | ⬇️ |
| trio | 70.45MB | 70.15MB | -0.42% (304.46kB) | ⬇️ |
| sphinx | 167.01MB | 166.42MB | -0.35% (603.98kB) | ⬇️ |
| prefect | 449.07MB | 447.24MB | -0.41% (1.83MB) | ⬇️ |

Significant changes

flake8

| Name | Old | New | Diff | Outcome |
| --- | --- | --- | --- | --- |
| semantic_index | 7.86MB | 7.72MB | -1.79% (144.46kB) | ⬇️ |

trio

| Name | Old | New | Diff | Outcome |
| --- | --- | --- | --- | --- |
| semantic_index | 17.73MB | 17.44MB | -1.68% (304.46kB) | ⬇️ |

sphinx

| Name | Old | New | Diff | Outcome |
| --- | --- | --- | --- | --- |
| semantic_index | 37.37MB | 36.78MB | -1.58% (603.98kB) | ⬇️ |

prefect

| Name | Old | New | Diff | Outcome |
| --- | --- | --- | --- | --- |
| semantic_index | 114.18MB | 112.36MB | -1.60% (1.83MB) | ⬇️ |

astral-sh-bot Bot commented Jun 25, 2026


ecosystem-analyzer results

No diagnostic changes detected ✅

Full report with detailed diff (timing results)

@charliermarsh charliermarsh force-pushed the charlie/narrow-semantic-node-id-maps branch from cb78a8e to 00a5263 Compare June 25, 2026 14:28
@charliermarsh charliermarsh force-pushed the charlie/codex-compact-semantic-node-maps branch from 2233529 to 7215cd9 Compare June 25, 2026 14:32
@charliermarsh charliermarsh marked this pull request as ready for review June 25, 2026 14:58
@charliermarsh charliermarsh requested a review from a team as a code owner June 25, 2026 14:58

@MichaReiser MichaReiser left a comment

Member

I'll wait for the conclusion on the base PR before reviewing this one. The improvements look promising, but we may need to move or change this PR depending on the outcome of the base PR.

@charliermarsh charliermarsh force-pushed the charlie/narrow-semantic-node-id-maps branch from 00a5263 to 5062fa2 Compare June 26, 2026 14:17
@charliermarsh charliermarsh requested review from a team, BurntSushi and dhruvmanila as code owners June 26, 2026 14:17
@charliermarsh charliermarsh changed the base branch from charlie/codex-compact-semantic-node-maps to main June 26, 2026 14:17
@charliermarsh charliermarsh removed request for a team June 26, 2026 14:20
@charliermarsh charliermarsh removed request for a team, BurntSushi and dhruvmanila June 26, 2026 14:20
@charliermarsh charliermarsh marked this pull request as draft June 26, 2026 14:20
@charliermarsh charliermarsh force-pushed the charlie/narrow-semantic-node-id-maps branch from 5062fa2 to e8c35b9 Compare June 26, 2026 14:35
Store indexed semantic IDs as 16-bit values when every entry fits, with a full-width fallback for unusually large files. Apply the narrow representation to AST use IDs and scope lookups, partitioning primary and type-parameter scopes so distinct scope keys that share a node index remain separate.
@charliermarsh charliermarsh force-pushed the charlie/narrow-semantic-node-id-maps branch from e8c35b9 to 42ab9af Compare June 26, 2026 14:50
Labels

performance Potential performance improvement ty Multi-file analysis & type inference

2 participants