Motivation
Sparse rate matrices indexed by the same axis twice are a recurring modelling pattern that op_system currently can't express compactly. The motivating case is the SMH Round 19 immune-ladder vaccination uptake (flepimop2-demo/configs/SMH_R19_op_system.yml, transitions section 5):
- from: X[age, vax=unvaccinated, loc, imm=X0]
  to: X[age, vax=vaccinated, loc, imm=X3]
  rate: eta_X0toX3 * nu[age]
- from: X[age, vax=unvaccinated, loc, imm=X0]
  to: X[age, vax=vaccinated, loc, imm=X4]
  rate: eta_X0toX4 * nu[age]
# ... 15 more
This expands to:
- 17 transition records, one per nonzero (imm_from, imm_to) pair on the imm ladder.
- 17 sibling parameter declarations eta_X{i}toX{j} whose joint structure (a sparse 11×11 imm × imm matrix) is invisible to op_system, the parameter plugin, and inference.
Other models will hit the same shape: contact matrices C[age, age], mobility/mixing kernels M[loc, loc], mutation/cross-immunity kernels K[strain, strain].
Proposed feature
Three composable additions, followed by a note on plugin compatibility:
1. Independent placeholders decoupled from axis names ($-prefix)
Today, X[age, vax] introduces wildcards whose names are the axis names, and rate-expression placeholders bind by axis name. This couples the placeholder identity to the axis identity and forbids using the same axis twice.
Introduce $-prefixed placeholders that are independent identifiers bound to a coord on a specified axis:
- from: X[age, vax=unvaccinated, loc, imm=$i]
  to: X[age, vax=vaccinated, loc, imm=$j]
  rate: eta[imm=$i, imm=$j] * nu[age]
$i and $j are two independent free variables, each ranging over imm.coords. Existing axis-name wildcards (X[age, vax]) continue to work as a shorthand equivalent to X[age=$age, vax=$vax].
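The shorthand equivalence described above can be sketched in a few lines of Python. This is only an illustration of the proposed grammar, not op_system's actual parser: the names `Term` and `parse_terms` are hypothetical, and the sketch assumes selectors are simple comma-separated `axis`, `axis=$name`, or `axis=coord` terms.

```python
import re
from typing import NamedTuple, Optional


class Term(NamedTuple):
    """One selector term (hypothetical structure, for illustration only)."""
    axis: str
    placeholder: Optional[str]  # "i" for imm=$i; None when pinned to a coord
    literal: Optional[str]      # "unvaccinated" for vax=unvaccinated; else None


def parse_terms(selector: str) -> tuple[str, list[Term]]:
    """Split 'X[age, imm=$i]' into its name and an ordered list of terms."""
    match = re.fullmatch(r"\s*(\w+)\[(.*)\]\s*", selector)
    if match is None:
        raise ValueError(f"not a selector: {selector!r}")
    name, body = match.groups()
    terms: list[Term] = []
    for raw in body.split(","):
        axis, sep, value = (part.strip() for part in raw.partition("="))
        if not axis:
            raise ValueError(f"empty term in selector: {selector!r}")
        if not sep:
            # Bare axis name: shorthand for axis=$axis.
            terms.append(Term(axis, axis, None))
        elif value.startswith("$"):
            # Independent placeholder bound to a coord on this axis.
            terms.append(Term(axis, value[1:], None))
        else:
            # Axis pinned to a literal coord.
            terms.append(Term(axis, None, value))
    return name, terms


# The legacy wildcard form and the explicit $-form parse to the same terms.
assert parse_terms("X[age, vax]") == parse_terms("X[age=$age, vax=$vax]")
```

Keeping the terms as an ordered list, rather than a mapping keyed by axis, is what lets `eta[imm=$i, imm=$j]` carry two distinct roles on the same axis.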
2. Repeated-axis support in selectors
Allow the same axis name to appear more than once on a selector, provided each occurrence is bound to a distinct $-placeholder (or pinned to a literal coord). parse_selector currently rejects X[age, age=a0] outright; relax that to allow eta[imm=$i, imm=$j] while still rejecting unbound duplicates like X[imm, imm].
This requires auditing every consumer that builds an axis → coord mapping (gather, apply_along, initial_state expansion, operator framework) to ensure repeated axes don't conflate roles.
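A minimal sketch of the relaxed validation rule, under stated assumptions: the function name `check_repeated_axes` and the `(axis, binding)` pair representation are hypothetical, and the sketch assumes a bare axis occurrence on a repeated axis is always rejected (the proposal leaves the exact treatment of `X[age, age=a0]` open).

```python
from collections import Counter
from typing import Optional

# (axis, binding): binding is "$name", a literal coord, or None for a bare axis.
Term = tuple[str, Optional[str]]


def check_repeated_axes(terms: list[Term]) -> list[Term]:
    """Accept repeated axes only when every occurrence is explicitly bound."""
    counts = Counter(axis for axis, _ in terms)
    seen: set[Term] = set()
    for axis, binding in terms:
        if counts[axis] == 1:
            continue  # single-use axes keep today's behaviour
        if binding is None:
            raise ValueError(f"axis {axis!r} is repeated with an unbound occurrence")
        if binding.startswith("$"):
            if (axis, binding) in seen:
                raise ValueError(f"placeholder {binding!r} reused on axis {axis!r}")
            seen.add((axis, binding))
    return terms


check_repeated_axes([("imm", "$i"), ("imm", "$j")])  # accepted
try:
    check_repeated_axes([("imm", None), ("imm", None)])  # X[imm, imm]
except ValueError as err:
    print("rejected:", err)

# Why consumers need auditing: a dict keyed by axis silently drops one role.
print(dict([("imm", "$i"), ("imm", "$j")]))  # {'imm': '$j'}
```

The last line shows the conflation risk concretely: any consumer that collapses the pairs into a mapping keyed by axis keeps only the final binding.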
3. New parameter module: sparse_table
A parameter module whose values are indexed by a tuple of axis-coord pairs:
parameter:
  eta:
    module: sparse_table
    indices: [imm, imm]  # axis appears twice
    values:
      [X0, X3]: 0.0
      [X0, X4]: 0.0
      [X1, X4]: 0.0
      # ... 14 more entries
The support set (declared keys) is structural metadata available at normalization time. Transitions whose rate references eta[imm=$i, imm=$j] iterate exactly the support set rather than the full Cartesian product:
transitions:
  - from: X[age, vax=unvaccinated, loc, imm=$i]
    to: X[age, vax=vaccinated, loc, imm=$j]
    rate: eta[imm=$i, imm=$j] * nu[age]
The expander recognizes that eta is a sparse table over ($i, $j) and emits one transition per declared (i, j) key. Equivalently, the user could write where: eta[imm=$i, imm=$j] != 0 explicitly; the presence of a sparse table in the rate would be sugar for that.
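The support-set iteration can be sketched as follows. This is an illustration under assumptions, not the actual expander: `substitute` and `expand_sparse` are hypothetical helpers, placeholders are replaced by plain text substitution, and the support list holds only the three keys shown in the example above.

```python
import re


def substitute(template: str, bindings: dict[str, str]) -> str:
    """Replace each $name in the template with its bound coord."""
    return re.sub(r"\$(\w+)", lambda m: bindings[m.group(1)], template)


def expand_sparse(
    transition: dict[str, str],
    support: list[tuple[str, ...]],
    placeholders: tuple[str, ...] = ("i", "j"),
) -> list[dict[str, str]]:
    """Emit one concrete transition record per declared key in the support set."""
    records = []
    for key in support:
        bindings = dict(zip(placeholders, key))
        records.append(
            {field: substitute(text, bindings) for field, text in transition.items()}
        )
    return records


template = {
    "from": "X[age, vax=unvaccinated, loc, imm=$i]",
    "to": "X[age, vax=vaccinated, loc, imm=$j]",
    "rate": "eta[imm=$i, imm=$j] * nu[age]",
}
# Declared keys of the sparse table (only the three shown in the proposal).
support = [("X0", "X3"), ("X0", "X4"), ("X1", "X4")]

records = expand_sparse(template, support)
for record in records:
    print(record["rate"])
```

Because the loop runs over the declared keys rather than over every pair of imm coords, the full table would yield 17 records instead of the 121 cells of the 11×11 product.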
A companion dense_table module (full Cartesian product) is the obvious complement for non-sparse matrices like contact matrices.
4. Plugin compatibility
The parameter plugin (when it lands) needs to populate sparse_table modules from CSV / Parquet inputs; the legacy R19 layout is one file per (i, j) pair, which maps naturally to one sparse-table entry each. Inference would then treat a single sparse_table as one structured target with priors over the support set rather than 17 independent scalars.
Scope and risk
Touches:
- src/op_system/_templates.py: selector grammar ($-placeholders, repeated-axis support).
- src/op_system/_normalize.py: _collect_transition_wildcard_axes, _render_transition_combo, _expand_single_transition, _expand_initial_state_templates, gather/apply_along consumers.
- New parameter module(s) and registry hook for sparse_table / dense_table.
- Documentation: README + spec reference updates.
- Migration of flepimop2-demo/configs/SMH_R19_op_system.yml (17 transitions + 17 eta_* scalars → 1 template + 1 sparse table).
Risks / open questions:
- Grammar lock-in. The $-prefix syntax is a one-shot decision; worth validating against 2-3 example configs (R19 ladder, contact matrix, mutation kernel) before shipping.
- Repeated-axis ripples. Every consumer of axis-coord assignments must be audited. Some may need to switch from dict[axis, coord] to list[(axis, coord)] internally.
- Sparse iteration semantics. Support set is structural (fixed at normalization), not runtime. Document this explicitly so users don't expect a rate's runtime zero values to prune transitions.
- Inference design coupling. Treating a sparse table as one target with structured priors is a real win for inference, but the precise interface should land with or after the parameter-inference work, not before.
Recommended sequencing
- Land the design here (grammar + module shape) on paper. Don't write code yet.
- Wait for the parameter-inference work to stabilize so we can co-design the inference interface.
- Implement in a dedicated PR; migrate R19 in the same PR as a working example.
- Follow up with contact-matrix / mixing-kernel example configs.
Alternatives considered
- Status quo (Option A): Leave R19 with 17 explicit records. Honest representation of sparse irregular data, but doesn't generalize and leaves eta invisible to inference as a joint object.
- for_each: template list (Option B): Compress the transitions block to one template + a row list, but the parameter block still has 17 disconnected scalars. Smaller change but doesn't address the inference-interface or matrix-parameter cases.
Option C (this issue) is the right long-term answer specifically because contact matrices and mutation kernels make repeated-axis matrix parameters near-certain in future models.