Skip to content

fix: accept subscript and attribute targets in unpacking, for-loop, and comprehension contexts#473

Open
mbeschastn0v wants to merge 2 commits into
pydantic:mainfrom
mbeschastn0v:mb/issue-408-unpack-subscript-attr
Open

fix: accept subscript and attribute targets in unpacking, for-loop, and comprehension contexts#473
mbeschastn0v wants to merge 2 commits into
pydantic:mainfrom
mbeschastn0v:mb/issue-408-unpack-subscript-attr

Conversation

@mbeschastn0v

@mbeschastn0v mbeschastn0v commented May 28, 2026

Copy link
Copy Markdown
Contributor

Closes #408.

Summary

CPython allows subscript (x[i]) and attribute (o.a) expressions as targets inside tuple unpacking, for-loop variables, and comprehension for clauses. Monty's parse_unpack_target_impl only accepted Name/Tuple/Starred/List and rejected everything else with SyntaxError: invalid unpacking target: <kind>. That breaks:

  • x[0], x[1] = x[1], x[0] (the reported case)
  • x[0], = (1,)
  • o.a, o.b = 1, 2 (attribute targets — same parse arm, wider than the issue title)
  • for box[0] in iter: / for o.a in iter:
  • [... for box[0] in iter] and the same shape in comprehensions

This PR fixes all five contexts in a single parse-site change plus the matching prepare/compile wiring.

Design

UnpackTarget gains two new leaf variants Subscript { target, index, target_position } and Attribute { object, attr, target_position }, mirroring AssignTarget::Subscript/Attr exactly. The two-enum split between AssignTarget (top-level assignment LHS) and UnpackTarget (inside-an-unpack pattern) is intentional grammar typing — Starred is only valid inside unpack, Unpack is only valid at the top — and the fix preserves that. The leaves themselves are duplicated between the two enums (matching how Name is duplicated today); a follow-up could extract a shared LeafAssignTarget if a third enum ever needs the same leaves, but YAGNI for now.

Implementation notes

  • Compiler stack invariant: compile_unpack_target and process_unpack_sim delegate to emit_subscript_store/emit_attr_store. Each variant's net operand-stack effect is −1, identical to Name/Starred, so the existing per-target loop and compile_comp_target_unpack's leaf enumeration work unchanged. In the comprehension simulator the sim ↔ operand-stack 1:1 invariant is preserved by shrinking sim in lock-step (the Pending is popped and nothing pushed back — no sentinel needed). Traced through Tuple { sub_0=Subscript, sub_1, sub_2 } cases to confirm.

  • CPython evaluation order: RHS unpacks first, then each target's sub-expressions evaluate at store time. Locked in by the side-effecting index test in unpack__ops.py.

  • Silent-correctness fix: collect_cell_vars_from_node and collect_referenced_names_from_node previously ignored Node::For/Node::With targets — correct then, since UnpackTarget had no sub-expressions. After this change the targets can carry expressions that reference enclosing variables, so two new helpers collect_{cell_vars,referenced_names}_from_unpack_target are wired into both For and With arms.

  • Comp-var slot allocation: prepare_unpack_target_for_comprehension's new arms intentionally do NOT allocate comp-var slots. Subscript/attribute targets mutate existing state, they don't introduce a binding — their sub-expressions resolve against the surrounding scope via the normal prepare_expression path.

  • Walrus inside subscript index: a walrus (i := 0) inside the index expression of a subscript target correctly binds in the outer scope — collect_names_from_unpack_target walks the sub-expressions via collect_assigned_names_from_expr to pick it up. Locked in by the walrus in subscript index test.

Test plan

  • cargo test -p monty --features memory-model-checks (all rust tests pass, 968 cases)
  • cargo test -p monty --test parse_errors --features memory-model-checks (63 cases, including new negative snapshots)
  • cargo run -p monty-datatest --features memory-model-checks unpack__ / iter__for_loop / comprehension__ (all green)
  • make test-py (1091 Python binding tests pass)
  • CPython parity confirmed for every new test case
  • make format-rs && make lint-rs && make lint-py clean
  • Pre-commit hooks pass

Tests added

File Coverage
crates/monty/test_cases/unpack__ops.py subscript-target assignment unpack: single, two, swap, nested in tuple, name siblings, starred middle, walrus in index, side-effecting index
crates/monty/test_cases/iter__for_loop_unpacking.py for x[i] in iter: last-value, computed index re-evaluated each iteration, body visibility, nested tuple
crates/monty/test_cases/comprehension__all.py comp for clause with subscript targets, nested tuple subscript, leak-out semantics
crates/monty/test_cases/unpack__attr.py (new) attribute targets across assignment / swap / nested / starred / for-loop / comprehension via make_mutable_point()
crates/monty/tests/parse_errors.rs replaced for_loop_attribute_target_has_clean_message (the snapshot that locked the now-fixed rejection) with unpack_target_constant_rejected / unpack_target_call_rejected to lock the upstream ruff rejection for genuinely-invalid LHS shapes

Out of scope

  • Refactoring the leaf overlap between AssignTarget and UnpackTarget into a shared LeafAssignTarget — preserved the existing architecture; happy to split that out if reviewers prefer.
  • Augmented assignment (x[0] += 1) — separate code path via AugAssign, unchanged.

Summary by cubic

Allow subscript (x[i]) and attribute (obj.a) targets in unpacking, for-loops, and comprehensions, matching CPython. Fixes #408 and unblocks swaps, nested patterns, and comprehension targets.

  • Bug Fixes
    • Parser/AST: Add UnpackTarget::Subscript and UnpackTarget::Attribute (with target_position); parse arms mirror assign-target shapes.
    • Compiler: Delegate stores to emit_subscript_store/emit_attr_store; net stack effect −1; preserve CPython eval order; comprehension sim shrinks in lock-step.
    • Prepare/Scope: Recurse into sub-expressions; no comp-var slots for these leaves; walk For/With targets for cell-var and referenced-name analysis; walrus inside indices binds in outer scope.
    • Tests: Cover assignment, for-loops, and comprehensions (subs/attrs: swaps, nested, starred, side effects, leak-out); add lambda-in-subscript-index cases to exercise closure capture in for targets and nested tuple targets; update parse-error snapshots to lock upstream rejections for constant/call LHS shapes.

Written for commit 51a2d41. Summary will update on new commits.

Review in cubic

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 9 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="crates/monty/test_cases/iter__for_loop_unpacking.py">

<violation number="1" location="crates/monty/test_cases/iter__for_loop_unpacking.py:79">
P2: Test for subscript index re-evaluation is non-discriminating and cannot detect incorrect caching</violation>
</file>

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

# Subscript target with computed index — index is re-evaluated each iteration
a = [0, 0]
i = 0
for a[i] in (10, 20):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Test for subscript index re-evaluation is non-discriminating and cannot detect incorrect caching

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/monty/test_cases/iter__for_loop_unpacking.py, line 79:

<comment>Test for subscript index re-evaluation is non-discriminating and cannot detect incorrect caching</comment>

<file context>
@@ -64,3 +64,33 @@
+# Subscript target with computed index — index is re-evaluated each iteration
+a = [0, 0]
+i = 0
+for a[i] in (10, 20):
+    # i never changes, so both iterations write the same slot
+    pass
</file context>

@codspeed-hq

codspeed-hq Bot commented May 28, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 17 untouched benchmarks
⏩ 15 skipped benchmarks1


Comparing mbeschastn0v:mb/issue-408-unpack-subscript-attr (51a2d41) with main (10ab8bd)

Open in CodSpeed

Footnotes

  1. 15 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@codecov

codecov Bot commented May 28, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 77.39130% with 26 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/monty/src/prepare.rs 80.68% 9 Missing and 8 partials ⚠️
crates/monty/src/bytecode/compiler.rs 68.42% 2 Missing and 4 partials ⚠️
crates/monty/src/parse.rs 62.50% 0 Missing and 3 partials ⚠️

📢 Thoughts on this report? Let us know!

@mbeschastn0v mbeschastn0v force-pushed the mb/issue-408-unpack-subscript-attr branch from e566b4f to 3de7147 Compare May 28, 2026 19:47
@mbeschastn0v

Copy link
Copy Markdown
Contributor Author

Thanks @cubic-dev-ai — valid catch. The for a[i] in (10, 20) test with i = 0 never reassigned i inside the body, so a == [20, 0] was satisfied by both the correct re-evaluation behavior and the bad cache-the-index alternative.

Fixed in 3de7147 by mutating i inside the loop body so the second store relies on re-evaluation. The discriminating outcome is now a == [10, 20] (vs. [20, 0] if the index were cached at loop entry).

The comprehension counterpart in comprehension__all.py had the same non-discriminating shape but cannot be made discriminating without a body to mutate i in — I rephrased its assertion message to drop the "re-evaluated" claim and just state what it actually verifies (subscript store on each emitted value).

Force-pushed: also trimmed verbose comments/docstrings throughout the patch per separate maintainer feedback.

@cubic-dev-ai

cubic-dev-ai Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

Thanks @cubic-dev-ai — valid catch. The for a[i] in (10, 20) test with i = 0 never reassigned i inside the body, so a == [20, 0] was satisfied by both the correct re-evaluation behavior and the bad cache-the-index alternative.

Fixed in 3de7147 by mutating i inside the loop body so the second store relies on re-evaluation. The discriminating outcome is now a == [10, 20] (vs. [20, 0] if the index were cached at loop entry).

The comprehension counterpart in comprehension__all.py had the same non-discriminating shape but cannot be made discriminating without a body to mutate i in — I rephrased its assertion message to drop the "re-evaluated" claim and just state what it actually verifies (subscript store on each emitted value).
...

@mbeschastn0v Looks good — the discriminating test is clean. First iteration: i=0a[0]=10, second: i=1a[1]=20, yielding [10, 20]. If the index were cached, both would hit a[0] and you'd get [20, 0]. Solid.

The comprehension side makes sense too — no body means no place to mutate i, so just testing that the store fires on each emitted value is the right scope.

And the comment/docstring trim fits what you added to CLAUDE.md — the new coverage on collect_cell_vars_from_unpack_target/collect_referenced_names_from_unpack_target has tighter inline docs that say what matters (why For/With targets were previously skipped, what changed) without restating the obvious.

Tip: get faster answers by chatting with cubic’s review copilot in the review UI

@mbeschastn0v

mbeschastn0v commented May 28, 2026

Copy link
Copy Markdown
Contributor Author

@samuelcolvin
Architectural question for follow-up consideration:
Before this PR, AssignTarget and UnpackTarget shared one variant (Name). With the two new leaves this PR adds, they now share three: Name, Subscript, Attribute. That tips the balance toward extracting a shared leaf type, e.g.:

pub enum LeafAssignTarget {
      Name(Identifier),
      Subscript { target: ExprLoc, index: ExprLoc, target_position: CodeRange },
      Attribute { object: ExprLoc, attr: EitherStr, target_position: CodeRange },
 }
 
pub enum AssignTarget {
      Leaf(LeafAssignTarget),
      Unpack { targets: Vec<UnpackTarget>, targets_position: CodeRange },
}
 
pub enum UnpackTarget {
      Leaf(LeafAssignTarget),
      Tuple { targets: Vec<Self>, position: CodeRange },
      Starred(Identifier),
}

Wins: store-dispatch centralizes (one emit_leaf_store instead of three parallel match arms in compile_assign_target / compile_unpack_target / process_unpack_sim); collector helpers shrink similarly. The grammar split you encoded — Starred exclusive to unpack, Unpack exclusive to top-level assign — stays intact.
Cost: every leaf match becomes two-level (UnpackTarget::Leaf(LeafAssignTarget::Subscript { … })), which is patterns getting noisier even when total LoC drops.
I left it out of this PR to keep the diff scoped to the bug fix, but happy to either bundle it in here or open a follow-up — whichever you prefer. Or skip entirely if you'd rather keep the flat enums; the duplication is real but not large.

mbeschastn0v and others added 2 commits May 28, 2026 16:06
CPython allows subscript (`x[i]`) and attribute (`o.a`) expressions as
targets inside tuple unpacking, `for`-loop variables, and comprehension
`for` clauses. Monty's `parse_unpack_target_impl` only accepted
`Name`/`Tuple`/`Starred`/`List` and rejected everything else with
`SyntaxError: invalid unpacking target: <kind>`, breaking common
patterns like `x[0], x[1] = x[1], x[0]`, `o.a, o.b = 1, 2`,
`for box[0] in iter:`, and `[... for box[0] in iter]`.

The `UnpackTarget` enum gains `Subscript { target, index, target_position }`
and `Attribute { object, attr, target_position }` leaves, mirroring the
existing `AssignTarget::Subscript`/`Attr` shape. The two-enum split
between `AssignTarget` and `UnpackTarget` is preserved — `Starred`
remains exclusive to unpack and `Unpack` remains exclusive to top-level
assignment, encoding Python's grammar in the type system.

Implementation:
- Parser: two new arms in `parse_unpack_target_impl` mirror the
  matching arms in `parse_assign_target`.
- Prepare: `prepare_unpack_target` and `prepare_unpack_target_for_comprehension`
  recurse into the sub-expressions via `prepare_expression`. The new
  variants intentionally do NOT allocate comp-var slots — they mutate
  existing state rather than introducing a binding, so sub-expression
  names resolve against the surrounding scope.
- Scope analysis: `collect_names_from_unpack_target` walks
  sub-expressions only to pick up walrus targets (no leaf binding).
  Two new helpers `collect_cell_vars_from_unpack_target` and
  `collect_referenced_names_from_unpack_target` mirror the existing
  `*_from_assign_target` helpers. Both are wired into `Node::For` and
  `Node::With` arms, which previously ignored the target entirely
  (correct then, since targets had no sub-expressions — a silent
  correctness bug after this change without the new walks).
- Compiler: `compile_unpack_target` and `process_unpack_sim` delegate
  to `emit_subscript_store`/`emit_attr_store`. Each variant's net
  operand-stack effect is -1, identical to `Name`/`Starred`, so the
  existing per-target loop and `compile_comp_target_unpack`'s leaf
  enumeration work unchanged. For the comprehension path the
  sim <-> operand-stack 1:1 invariant is preserved by shrinking sim
  in lock-step (no placeholder pushed back).

CPython evaluation order is preserved: the RHS is unpacked first,
then each target's sub-expressions are evaluated at store time. The
new test `unpack__ops.py::side-effecting index` locks this in.

Tests:
- `crates/monty/test_cases/unpack__ops.py` — subscript-target assignment
  unpack (single target, two targets, swap, nested, starred siblings,
  walrus in index, side-effecting index).
- `crates/monty/test_cases/iter__for_loop_unpacking.py` — `for x[i] in iter:`
  with last-value semantics, computed indices, body visibility, nested
  tuple targets.
- `crates/monty/test_cases/comprehension__all.py` — comprehension
  `for` clause with subscript targets, nested forms, leak-out semantics.
- `crates/monty/test_cases/unpack__attr.py` — full attribute-target
  coverage (assignment, swap, nested, starred, for-loop, comprehension)
  via the existing `make_mutable_point()` external fixture.
- `crates/monty/tests/parse_errors.rs` — replaced the
  `for_loop_attribute_target_has_clean_message` snapshot (locked
  the previous rejection) with `unpack_target_constant_rejected` and
  `unpack_target_call_rejected` which lock the upstream ruff rejection
  for genuinely-invalid LHS shapes.

Closes pydantic#408.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two tests added to `unpack__ops.py` that drive the cell-var and
referenced-name walks introduced for `Node::For` / `Node::With` targets:

- `_for_target_lambda` puts a closure-capturing lambda inside the
  index of a for-loop subscript target. Without the walker added in
  `collect_cell_vars_from_unpack_target` /
  `collect_referenced_names_from_unpack_target`, the captured outer
  local would be invisible to scope analysis when nothing else in the
  enclosing function references it.

- `_for_target_tuple_lambda` wraps the same pattern in a nested tuple
  target, exercising the `Tuple` recursion arm in both walkers.

Together these close the codecov gap on the defense-in-depth scope
walks that fire only when a subscript/attribute target's
sub-expression contains a closure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mbeschastn0v mbeschastn0v force-pushed the mb/issue-408-unpack-subscript-attr branch from 409384c to 51a2d41 Compare May 28, 2026 23:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Subscript targets fail in unpacking assignment

1 participant