Repeated-axis selectors and sparse 2D parameter tables #88

@jc-macdonald

Motivation

Sparse rate matrices indexed by the same axis twice are a recurring modelling pattern that op_system currently can't express compactly. The motivating case is the SMH Round 19 immune-ladder vaccination uptake (flepimop2-demo/configs/SMH_R19_op_system.yml, transitions section 5):

- from: X[age, vax=unvaccinated, loc, imm=X0]
  to:   X[age, vax=vaccinated,   loc, imm=X3]
  rate: eta_X0toX3 * nu[age]
- from: X[age, vax=unvaccinated, loc, imm=X0]
  to:   X[age, vax=vaccinated,   loc, imm=X4]
  rate: eta_X0toX4 * nu[age]
# ... 15 more

This expands to:

  • 17 transition records, one per nonzero (imm_from, imm_to) pair on the imm ladder.
  • 17 sibling parameter declarations eta_X{i}toX{j} whose joint structure (a sparse 11×11 imm × imm matrix) is invisible to op_system, the parameter plugin, and inference.

Other models will hit the same shape: contact matrices C[age, age], mobility/mixing kernels M[loc, loc], mutation/cross-immunity kernels K[strain, strain].

Proposed feature

Three composable additions, followed by a note on plugin compatibility:

1. Independent placeholders decoupled from axis names ($-prefix)

Today, X[age, vax] introduces wildcards whose names are the axis names, and rate-expression placeholders bind by axis name. This couples the placeholder identity to the axis identity and forbids using the same axis twice.

Introduce $-prefixed placeholders that are independent identifiers bound to a coord on a specified axis:

- from: X[age, vax=unvaccinated, loc, imm=$i]
  to:   X[age, vax=vaccinated,   loc, imm=$j]
  rate: eta[imm=$i, imm=$j] * nu[age]

$i and $j are two independent free variables, each ranging over imm.coords. Existing axis-name wildcards (X[age, vax]) continue to work as a shorthand equivalent to X[age=$age, vax=$vax].
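The shorthand equivalence can be sketched as a small parser. This is a minimal illustration, not op_system's actual selector code: `Binding` and `parse_entries` are hypothetical names, and the grammar handled is only the comma-separated `axis`, `axis=literal`, and `axis=$name` forms shown above.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Binding:
    """One selector entry: an axis plus either a placeholder or a literal coord."""

    axis: str
    placeholder: Optional[str] = None  # e.g. "$i"
    literal: Optional[str] = None      # e.g. "unvaccinated"


def parse_entries(selector: str) -> tuple[str, list[Binding]]:
    """Split 'X[age, imm=$i]' into its name and an ordered list of bindings.

    Hypothetical helper for illustration only.
    """
    name, _, rest = selector.partition("[")
    if not rest.endswith("]"):
        raise ValueError(f"malformed selector: {selector!r}")
    bindings = []
    for raw in rest[:-1].split(","):
        axis, sep, value = (part.strip() for part in raw.partition("="))
        if not sep:
            # Bare axis name: shorthand for axis=$axis.
            bindings.append(Binding(axis, placeholder=f"${axis}"))
        elif value.startswith("$"):
            # Independent placeholder bound to a coord on this axis.
            bindings.append(Binding(axis, placeholder=value))
        else:
            # Pinned to a literal coord.
            bindings.append(Binding(axis, literal=value))
    return name.strip(), bindings
```

Under this sketch, `parse_entries("X[age, vax]")` and `parse_entries("X[age=$age, vax=$vax]")` produce the same bindings, which is the backward-compatibility property described above.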

2. Repeated-axis support in selectors

Allow the same axis name to appear more than once on a selector, provided each occurrence is bound to a distinct $-placeholder (or pinned to a literal coord). parse_selector currently rejects X[age, age=a0] outright; relax that to allow eta[imm=$i, imm=$j] while still rejecting unbound duplicates like X[imm, imm].

This requires auditing every consumer that builds an axis → coord mapping (gather, apply_along, initial_state expansion, operator framework) to ensure repeated axes don't conflate roles.
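The relaxed rule might look roughly like the check below. It is a sketch under stated assumptions: `validate_selector` is a hypothetical name (not the existing parse_selector), entries arrive as already-split `(axis, value)` pairs, and a bare axis is treated as `axis=$axis`. Whether a mixed form such as X[age, age=a0] should be accepted is not settled by the text above; this sketch permits it.

```python
def validate_selector(entries):
    """Normalize and validate (axis, value) pairs from one selector.

    `value` is None for a bare axis, a "$name" placeholder, or a literal coord.
    Returns the normalized pairs, or raises ValueError when the same
    placeholder is bound twice on the same axis.
    """
    # Bare axes become axis=$axis, so X[imm, imm] turns into two "$imm" bindings.
    normalized = [
        (axis, value if value is not None else f"${axis}")
        for axis, value in entries
    ]
    seen = set()
    for axis, value in normalized:
        if not value.startswith("$"):
            continue  # literal coords never clash with placeholders
        if (axis, value) in seen:
            raise ValueError(
                f"axis {axis!r} is repeated with the same placeholder {value!r}"
            )
        seen.add((axis, value))
    return normalized
```

With this rule, `[("imm", "$i"), ("imm", "$j")]` passes, while `[("imm", None), ("imm", None)]` is rejected because both occurrences collapse to `$imm`.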

3. New parameter module: sparse_table

A parameter module whose values are indexed by a tuple of axis-coord pairs:

parameter:
  eta:
    module: sparse_table
    indices: [imm, imm]              # axis appears twice
    values:
      [X0, X3]: 0.0
      [X0, X4]: 0.0
      [X1, X4]: 0.0
      # ... 14 more entries

The support set (declared keys) is structural metadata available at normalization time. Transitions whose rate references eta[imm=$i, imm=$j] iterate exactly the support set rather than the full Cartesian product:

transitions:
  - from: X[age, vax=unvaccinated, loc, imm=$i]
    to:   X[age, vax=vaccinated,   loc, imm=$j]
    rate: eta[imm=$i, imm=$j] * nu[age]

The expander recognizes that eta is a sparse table over ($i, $j) and emits one transition per declared (i, j) key. Equivalently, the user could write where: eta[imm=$i, imm=$j] != 0 explicitly; the presence of a sparse table in the rate would be sugar for that.
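A rough sketch of support-set expansion, to make the contrast with the Cartesian product concrete. Everything here is illustrative: `expand_over_support` is a hypothetical helper, the coord names X0..X10 are an assumption (the issue only states that the matrix is 11×11), only the three keys shown above are included, and placeholder substitution is naive string replacement rather than real template rendering.

```python
from itertools import product

# Assumed coord names for the 11-level imm axis; the issue does not list them.
IMM_COORDS = [f"X{k}" for k in range(11)]

# Only the three keys shown in the example above; the real table has 17.
ETA_SUPPORT = {("X0", "X3"): 0.0, ("X0", "X4"): 0.0, ("X1", "X4"): 0.0}

TEMPLATE = {
    "from": "X[age, vax=unvaccinated, loc, imm=$i]",
    "to": "X[age, vax=vaccinated, loc, imm=$j]",
    "rate": "eta[imm=$i, imm=$j] * nu[age]",
}


def expand_over_support(template, support):
    """Emit one transition record per declared (i, j) key, in declaration order."""
    records = []
    for i, j in support:
        records.append(
            {
                field: text.replace("$i", i).replace("$j", j)
                for field, text in template.items()
            }
        )
    return records


records = expand_over_support(TEMPLATE, ETA_SUPPORT)

# For comparison: a dense expansion would visit every (i, j) pair on the axis.
dense_count = len(list(product(IMM_COORDS, repeat=2)))
```

Note that the keys are iterated regardless of their values (all 0.0 here), which matches the structural, normalization-time reading of the support set.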

A companion dense_table module (full Cartesian product) is the obvious complement for non-sparse matrices like contact matrices.

4. Plugin compatibility

The parameter plugin (when it lands) needs to populate sparse_table modules from CSV / Parquet inputs; the legacy R19 layout is one file per (i, j) pair, which maps naturally to one sparse-table entry each. Inference would treat a single sparse_table as one structured target with priors over the support set rather than as 17 independent scalars.
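As a purely illustrative sketch of the population step, the snippet below reads a long-format CSV into a `{(i, j): value}` mapping. The single-file layout, the column names `imm_from`, `imm_to`, and `value`, and the `load_sparse_table` helper are all assumptions made for the example; the plugin interface and the actual R19 file format are not specified here.

```python
import csv
import io


def load_sparse_table(text):
    """Read rows of (imm_from, imm_to, value) into a {(i, j): value} mapping.

    Hypothetical loader; duplicate keys are rejected so the support set
    stays unambiguous.
    """
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["imm_from"], row["imm_to"])
        if key in table:
            raise ValueError(f"duplicate sparse_table key: {key}")
        table[key] = float(row["value"])
    return table


# Sample input mirroring the three entries shown in the sparse_table example.
SAMPLE = "imm_from,imm_to,value\nX0,X3,0.0\nX0,X4,0.0\nX1,X4,0.0\n"
eta = load_sparse_table(SAMPLE)
```

A one-file-per-pair layout would reduce to the same mapping, with each file contributing a single key.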

Scope and risk

Touches:

  • src/op_system/_templates.py: selector grammar ($-placeholders, repeated-axis support).
  • src/op_system/_normalize.py: _collect_transition_wildcard_axes, _render_transition_combo, _expand_single_transition, _expand_initial_state_templates, gather/apply_along consumers.
  • New parameter module(s) and registry hook for sparse_table / dense_table.
  • Documentation: README + spec reference updates.
  • Migration of flepimop2-demo/configs/SMH_R19_op_system.yml (17 transitions + 17 eta_* scalars → 1 template + 1 sparse table).

Risks / open questions:

  • Grammar lock-in. The $-prefix syntax is a one-shot decision; worth validating against 2-3 example configs (R19 ladder, contact matrix, mutation kernel) before shipping.
  • Repeated-axis ripples. Every consumer of axis-coord assignments must be audited. Some may need to switch from dict[axis, coord] to list[(axis, coord)] internally.
  • Sparse iteration semantics. Support set is structural (fixed at normalization), not runtime. Document this explicitly so users don't expect a rate's runtime zero values to prune transitions.
  • Inference design coupling. Treating a sparse table as one target with structured priors is a real win for inference, but the precise interface should land with or after the parameter-inference work, not before.
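The repeated-axis ripple is easy to reproduce in isolation. The snippet below is a generic Python illustration, not op_system code: it shows why an axis → coord dict cannot carry a selector like eta[imm=$i, imm=$j] once the placeholders are resolved, and why an ordered list of pairs can.

```python
# Resolved assignments for eta[imm=$i, imm=$j] with $i=X0 and $j=X3.
pairs = [("imm", "X0"), ("imm", "X3")]

# A dict keyed by axis silently keeps only the last occurrence.
as_dict = dict(pairs)

# An ordered list of (axis, coord) pairs keeps both roles distinct.
as_list = list(pairs)

lost_roles = len(as_list) - len(as_dict)  # how many assignments the dict dropped
```

Here `as_dict` ends up as `{"imm": "X3"}`, so the "from" coord is lost without any error being raised, which is the kind of conflation the audit needs to catch.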

Recommended sequencing

  1. Land the design here (grammar + module shape) on paper. Don't write code yet.
  2. Wait for the parameter-inference work to stabilize so we can co-design the inference interface.
  3. Implement in a dedicated PR; migrate R19 in the same PR as a working example.
  4. Follow up with contact-matrix / mixing-kernel example configs.

Alternatives considered

  • Status quo (Option A): Leave R19 with 17 explicit records. Honest representation of sparse irregular data, but doesn't generalize and leaves eta invisible to inference as a joint object.
  • for_each: template list (Option B): Compress the transitions block to one template + a row list, but the parameter block still has 17 disconnected scalars. Smaller change but doesn't address the inference-interface or matrix-parameter cases.

Option C (this issue) is the right long-term answer specifically because contact matrices and mutation kernels make repeated-axis matrix parameters near-certain in future models.


Labels: architecture (structural or design-level changes), enhancement (new feature or request)
