Skip to content

Conversation

@ana-pantilie
Copy link
Contributor

Needs #7334

Unisay and others added 13 commits November 13, 2025 19:06
Introduce new ExMemoryUsage wrapper that measures Value cost as the sum of logarithmic sizes (log of outer map size + log of max inner map size).

This reflects the O(log m + log k) cost behavior of two-level map lookups, based on experimental evidence showing lookup time scales linearly with the sum of depths rather than their maximum.

Modernizes integerLog2 usage by switching from GHC.Integer.Logarithms to GHC.Num.Integer.
Add ValueTotalSize and ValueLogOuterSizeAddLogMaxInnerSize to the DefaultUni builtin type system, enabling these wrappers to be used in builtin function signatures.

Both wrappers are coercions of the underlying Value type with specialized memory measurement behavior.
Add cost model parameters for four new Value-related builtins: LookupCoin (3 arguments), ValueContains (2 arguments), ValueData (1 argument), and UnValueData (1 argument).

Updates BuiltinCostModelBase type, memory models, cost model names, and unit cost models. Prepares infrastructure for actual cost models to be fitted from benchmarks.
Apply memory wrappers and cost model parameters to Value builtin denotations. LookupCoin wraps Value with ValueLogOuterSizeAddLogMaxInnerSize, ValueContains uses the wrapper for container and ValueTotalSize for contained value.

Replaces unimplementedCostingFun with actual cost model parameters. Updates golden type signatures to reflect wrapper types.
Add systematic benchmarking framework with worst-case test coverage: LookupCoin with 400 power-of-2 combinations testing BST depth range 2-21, ValueContains with 1000+ cases using multiplied_sizes model for x * y complexity.

Includes R statistical models: linearInZ for LookupCoin, multiplied_sizes for ValueContains to properly account for both container and contained sizes.
Update all three cost model variants (A, B, C) with parameters fitted from comprehensive benchmark runs. Includes extensive timing data covering full parameter ranges for all four Value builtins.

Models derived from remote benchmark runs on dedicated hardware with systematic worst-case test coverage ensuring conservative on-chain cost estimates.
Update test expectations across the codebase to reflect refined cost models: conformance test budgets (8 cases), ParamName additions for V1/V2/V3 ledger APIs (11 new params per version), param count tests, cost model registrations, and generator support.

All updates reflect the transition from placeholder costs to fitted models.
Document the addition of fitted cost model parameters for Value-related builtins based on comprehensive benchmark measurements.
Fix bug where worst-case entry could be duplicated in selectedEntries when it appears at a low position in allEntries (which happens for containers with small tokensPerPolicy values).

The issue occurred because the code took the first N-1 entries from allEntries and then appended worstCaseEntry, without checking if worstCaseEntry was already included in those first N-1 entries. For containers like 32768×2, the worst-case entry (policy[0], token[1]) is at position 1, so it was included in both the "others" list and explicitly appended, creating a duplicate.

Value.fromList deduplicates entries, resulting in benchmarks with one fewer entry than intended (e.g., 99 instead of 100), producing incorrect worst-case measurements.

Solution: Filter out worstCaseEntry from allEntries before taking the first N-1 entries, ensuring it only appears once at the end of the selected entries list.
Replace manual iteration + lookupCoin implementation with Data.Map.Strict's
isSubmapOfBy, which provides 2-4x performance improvement through:

- Parallel tree traversal instead of n₂ independent binary searches
- Better cache locality from sequential traversal
- Early termination on first mismatch
- Reduced function call overhead

Implementation change:
- Old: foldrWithKey + lookupCoin for each entry (O(n₂ × log(max(m₁, k₁))))
- New: isSubmapOfBy (isSubmapOfBy (<=)) (O(m₂ × k_avg) with better constants)

Semantic equivalence verified:
- Both check v2 ⊆ v1 using q2 ≤ q1 for all entries
- All plutus-core-test property tests pass (99 tests × 3 variants)
- Conformance tests show expected budget reduction (~50% CPU cost reduction)

Next steps:
- Re-benchmark with /costing:remote to measure actual speedup
- Re-fit cost model parameters (expect slope reduction from 6548 to ~1637-2183)
- Update conformance test budget expectations after cost model update

Credit: Based on optimization discovered by Kenneth.
Optimize generateConstrainedValueWithMaxPolicy to minimize off-path
map sizes while maintaining worst-case lookup guarantees:

1. Sort keys explicitly to establish predictable BST structure
2. Select maximum keys (last in sorted order) for worst-case depth
3. Populate only target policy with full token set (tokensPerPolicy)
4. Use minimal maps (1 token) for all other policies

Impact:
- 99.7% reduction in benchmark value size (524K → 1.5K entries)
- ~340× faster map construction during benchmark generation
- ~99.7% memory reduction (52 MB → 150 KB per value)
- Zero change to cost measurements (worst-case preserved)

Affects: LookupCoin, ValueContains benchmarks

Formula: totalEntries = tokensPerPolicy + (numPolicies - 1)
Example: 1024 policies × 512 tokens = 1,535 entries (was 524,288)

Rationale: BST lookups only traverse one path from root to leaf.
Off-path policies are never visited, so their inner map sizes don't
affect measurement. Reducing off-path maps from tokensPerPolicy to 1
eliminates 99.7% of irrelevant data without changing worst-case cost.

Technical details:
- ByteString keys already use worst-case comparison (28-byte prefix)
- Sorting + last selection guarantees maximum BST depth (rightmost leaf)
- Target policy still has full token set for worst-case inner lookup
- Validates correct behavior: build succeeds, benchmarks run normally
…ization

Update benchmark data and cost model parameters based on optimized
valueContains implementation using Map.isSubmapOfBy.

Benchmark results show significant performance improvement:
- Slope: 6548 → 1470 (4.5x speedup in per-operation cost)
- Intercept: 1000 → 1,163,050 (increased fixed overhead)

The slope reduction confirms the 3-4x speedup observed in local testing.
Higher intercept may reflect actual setup overhead in isSubmapOfBy or
statistical fitting on the new benchmark distribution.

Benchmark data: 1023 ValueContains measurements from GitHub Actions run
19367901303 testing the optimized implementation.
@ana-pantilie ana-pantilie force-pushed the ana/costing-builtin-value branch 2 times, most recently from 2ce3e1f to 6743c0e Compare November 19, 2025 21:24
@ana-pantilie ana-pantilie force-pushed the ana/costing-builtin-value branch from 9bdef5a to 2a702d1 Compare November 19, 2025 21:29
@ana-pantilie
Copy link
Contributor Author

Closed in favor of #7435

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants