Performance: Complex{Num} coefficients and QTerm encoding slow products/powers vs QuantumAlgebra.jl #164

A head-to-head benchmark (script: `benchmark/quantumalgebra_comparison.jl`) shows SecondQuantizedAlgebra (SQA) is **faster than QuantumAlgebra (QA) on building/transforming Hamiltonians and on commutators**, but **slower on predominantly-bosonic products and powers** (many-mode `H²`, Fock `(a·a†)ⁿ`, multi-spin `Hⁿ`).


<img width="1019" height="1055" alt="Image" src="https://github.com/user-attachments/assets/9f8e406a-d002-4e08-994f-9403f253bc96" />

Representative numbers (Julia 1.12, SQA v0.5.1, QA v1.6.0). The last column is QA time divided by SQA time: values above 1 mean SQA is faster, values below 1 mean QA is faster.

| benchmark | SQA | QA | QA/SQA |
|---|---:|---:|---:|
| JC build H | 7.5 µs | 30.4 µs | 4.1× |
| nested `[H,σ]` depth 8 | 5.7 ms | 17.9 ms | 3.1× |
| JC `Hⁿ` n=4 | 1.08 ms | 2.48 ms | 2.3× |
| Fock `(a·a†)¹⁰` | 1.37 ms | 0.25 ms | 0.2× |
| many-mode `H²` M=16 | 12.5 ms | 2.56 ms | 0.2× |
| Dicke(3) `H⁴` | 23.5 ms | 4.86 ms | 0.2× |

**Root cause.** SQA stores every prefactor as `CNum = Complex{Symbolics.Num}` and routes all coefficient arithmetic through Symbolics. Measured per-op cost: `Num*Num` ≈ 700 ns and `Num(x)` construction ≈ 168 ns, vs ≈ 1 ns for native `Int`/`ComplexF64`. The tax is `SymbolicUtils` hash-consing (a global cache insert plus a tree hash on every construction). Profiling confirms this, as does `BasicSymbolic*BasicSymbolic` ≈ 701 ns, which is essentially identical to `Num*Num`. So the `Num` wrapper is **not** the problem: dropping to raw `BasicSymbolic` would keep ~97% of the cost and add type instability. The cost is paid per term during product/power expansion, **even for integer coefficients**: Fock has integer coefficients yet still pays it, because Symbolics wraps numeric literals as hash-consed constants. QA avoids this entirely: its dict value is a native `Number`, and symbolic parameters live in the *term key* (`QuTerm.params::Vector{Param}`), so `ω·J` is a vector merge, not a CAS multiply.
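To make the "parameters in the term key" idea concrete, here is a minimal Base-Julia sketch. It is a toy model, not QA's or SQA's actual code: `ToyTerm`, `ToySum`, and `toy_mul` are hypothetical names, operators are plain `Symbol` labels, and no commutation rules or normal ordering are applied. It only illustrates that multiplying by a symbolic parameter can reduce to a vector merge plus one native multiply.

```julia
# Toy model of the "parameters live in the term key" layout (all names hypothetical).
struct ToyTerm
    params::Vector{Symbol}   # sorted symbolic parameter names, e.g. [:ω]
    ops::Vector{Symbol}      # operator labels in product order
end

# Structs with Vector fields need explicit ==/hash to behave as Dict keys.
Base.:(==)(a::ToyTerm, b::ToyTerm) = a.params == b.params && a.ops == b.ops
Base.hash(t::ToyTerm, h::UInt) = hash(t.ops, hash(t.params, h))

# A sum of terms: key carries symbols and operators, value is a native number.
const ToySum = Dict{ToyTerm,ComplexF64}

# Multiply two sums: coefficients multiply natively, parameter lists merge,
# operator lists concatenate. No operator algebra is performed.
function toy_mul(x::ToySum, y::ToySum)
    out = ToySum()
    for (tx, cx) in x, (ty, cy) in y
        key = ToyTerm(sort!(vcat(tx.params, ty.params)), vcat(tx.ops, ty.ops))
        out[key] = get(out, key, 0.0 + 0.0im) + cx * cy
    end
    return out
end

omega  = ToySum(ToyTerm([:ω], Symbol[]) => 2.0)   # the scalar 2ω
J      = ToySum(ToyTerm(Symbol[], [:J]) => 1.0)   # the operator J
omegaJ = toy_mul(omega, J)                        # one term: key ([:ω], [:J]), value 2
```

In this layout the symbolic factor never passes through a computer algebra system; the trade-off is that coefficients are limited to monomials in named parameters.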

**Profiling attribution** (share of active CPU, idle threads filtered):

| | coefficient (SymbolicUtils) | operator machinery | `QTerm` hashing |
|---|---:|---:|---:|
| many-mode `H²` M=16 (symbolic coeffs) | 70% | 20% | 10% |
| Fock `(a·a†)¹⁰` (integer coeffs) | 30% | 50% | 20% |
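As a rough cross-check, the profile shares can be combined with the timings from the representative-numbers table to estimate how much time would remain if coefficient arithmetic cost nothing. The short script below does that arithmetic. It assumes the profiled shares carry over unchanged to the benchmark wall times, which is only approximately true; the variable names are illustrative.

```julia
# Timings (ms) from the benchmark table and coefficient shares from the profile.
sqa_ms     = Dict("many-mode H² M=16" => 12.5, "Fock (a·a†)¹⁰" => 1.37)
qa_ms      = Dict("many-mode H² M=16" => 2.56, "Fock (a·a†)¹⁰" => 0.25)
coeff_frac = Dict("many-mode H² M=16" => 0.70, "Fock (a·a†)¹⁰" => 0.30)

# Time left over if the coefficient share dropped to zero (an idealised bound).
floor_ms = Dict(k => sqa_ms[k] * (1 - coeff_frac[k]) for k in keys(sqa_ms))

for k in keys(sqa_ms)
    println(k, ": floor ≈ ", round(floor_ms[k]; digits=2), " ms vs QA total ", qa_ms[k], " ms")
end
```

This gives roughly 3.75 ms for many-mode `H²` and 0.96 ms for Fock, both above QA's total time for the same benchmark.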

**Conclusion — two independent levers, neither sufficient alone:**

1. **Native numeric coefficients.** Make `CNum` a single concrete struct holding a native number, escalating to `Complex{Num}` only when a free symbol is genuinely present, and materialize back to `Num` only at the `substitute`/`average`/print boundaries. This stays type-stable (concrete struct plus type-preserving arithmetic). It is the biggest lever for symbolic-coefficient workloads (the 70% above), projected to bring SQA from ~5× to ~1.5× QA's time on many-mode/Dicke/SW, and it is contained to `cnum.jl`. It **does not** fix Fock, where coefficients account for only 30%. To also speed up genuinely symbolic coefficients the way QA does, the symbolic part would need a lightweight monomial-of-named-params form (QA's model) with full `Num` as fallback.

2. **Compact operator/term encoding.** Even if coefficient arithmetic were free, SQA's operator-machinery floor would still exceed QA's *total* (many-mode `H²` ~3.8 ms vs QA 2.56 ms; Fock ~0.95 ms vs QA 0.25 ms). SQA stores terms as `QTerm(ops::Vector{QSym}, ne)`, a heap vector of field structs that is re-hashed on every dict insert; QA uses near-isbits operators in a compact `BaseOpProduct` with an integer/Levi-Civita exchange. Cheaper or cached `QTerm` hashing, isbits operators, and fewer per-term allocations would lower this floor.
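One ingredient of the second lever, cached term hashing, can be sketched in a few lines of Base Julia. This is an illustration under stated assumptions, not a proposed patch: `CachedTerm` is a hypothetical stand-in for `QTerm`, operators are reduced to `Symbol` labels, and the sketch relies on `ops` never being mutated after construction.

```julia
# Hypothetical term type that computes its hash once, at construction.
struct CachedTerm
    ops::Vector{Symbol}   # stand-in for the operator list
    h::UInt               # hash of `ops`, computed once and stored
    CachedTerm(ops::Vector{Symbol}) = new(ops, hash(ops))
end

# Dict operations reuse the stored value instead of walking the vector again.
Base.hash(t::CachedTerm, seed::UInt) = hash(t.h, seed)

# Compare the cheap cached hash first, then confirm with the full contents.
Base.:(==)(a::CachedTerm, b::CachedTerm) = a.h == b.h && a.ops == b.ops

# Accumulate integer coefficients per term, as a product expansion would.
acc = Dict{CachedTerm,Int}()
for ops in ([:a, :adag], [:adag, :a], [:a, :adag])
    t = CachedTerm(ops)
    acc[t] = get(acc, t, 0) + 1
end
```

The hash is still computed once per constructed term, so the saving comes from repeated lookups, rehashing on dict growth, and early-exit equality checks; it does not remove the per-term allocation of `ops`.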


**Related.** Surfaced from #163 (benchmark against QuantumAlgebra). Lever 2 (compact term/operator encoding) overlaps existing optimization issues: #137 (hash-cons operator leaves), #141 (cache `uses_phys_key` as a `QTerm` field), #140 (`SmallDict` for short `QAdd` sums), #139 (optimize the `ne` / diagonal-split machinery). 

