Optimize javm recompiler performance #398

Description

@sorpaas

Goal

Improve the performance of the javm recompiler (x86-64 JIT). This is an open-ended issue for tracking incremental optimizations.

Architecture (capability-javm-v2)

The PVM uses a capability-based kernel with a Harvard architecture:

  • Kernel (crates/javm/src/kernel.rs): dispatches ecalli, manages VM lifecycle (CREATE/CALL/REPLY), capability operations (MAP/UNMAP/SPLIT/GRANT/REVOKE)
  • Recompiler (crates/javm/src/recompiler/): x86-64 JIT, one compilation per CODE cap, shared across all VMs using that cap
  • Interpreter (crates/javm/src/interpreter/): pre-decoded bytecode interpreter, reference backend
  • Signal handler (recompiler/signal.rs): SIGSEGV-based memory bounds checking via guard pages + trap table
  • Gas metering (gas_sim.rs): per-basic-block pipeline gas simulation
  • Backing store (backing.rs): memfd-backed physical memory pool, MAP_SHARED into 4GB CODE windows
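
To illustrate the trap-table half of the signal-handler design, here is a minimal sketch of how a faulting native address could be mapped back to a PVM program counter. The names `TrapEntry`, `TrapTable`, and the field layout are hypothetical and not taken from recompiler/signal.rs; the real handler also has to run inside a SIGSEGV context, which this sketch does not model.

```rust
/// One entry per JIT-emitted memory access (hypothetical layout).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TrapEntry {
    /// Offset of the native instruction from the start of the code buffer.
    pub native_offset: u32,
    /// PVM program counter of the guest instruction that emitted it.
    pub pvm_pc: u32,
}

/// Table sorted by `native_offset`, built once when a CODE cap is compiled.
pub struct TrapTable {
    entries: Vec<TrapEntry>,
}

impl TrapTable {
    /// Sorts the entries so that lookups can use binary search.
    pub fn new(mut entries: Vec<TrapEntry>) -> Self {
        entries.sort_by_key(|e| e.native_offset);
        Self { entries }
    }

    /// Exact-match lookup. A fault at an offset that is not in the table
    /// is not a guest memory trap and returns `None`.
    pub fn lookup(&self, native_offset: u32) -> Option<u32> {
        self.entries
            .binary_search_by_key(&native_offset, |e| e.native_offset)
            .ok()
            .map(|i| self.entries[i].pvm_pc)
    }
}
```

Keeping the table sorted means the lookup only does work on the fault path, so the common (non-faulting) memory access pays nothing beyond the guard-page mapping itself.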

Key types

  • CodeCap: compiled PVM code + 4GB virtual window (shared across VMs)
  • VmInstance: register state + cap table + lifecycle state (u16 VM ID, max 65535)
  • JitContext: repr(C) struct at fixed offsets for JIT native code (regs, gas, memory ptr, PC)
  • live_ctx: optimization keeping JitContext alive across ecalli dispatch (avoids register copies)
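
A rough sketch of what a repr(C) context with fixed offsets can look like, for readers unfamiliar with the pattern. The field order, the field types, and the register count of 13 are assumptions made for illustration; the actual JitContext layout in the crate may differ.

```rust
use std::mem::offset_of;

/// Number of guest registers (assumed here, not read from the crate).
pub const NUM_REGS: usize = 13;

/// Hypothetical context layout. `repr(C)` fixes the field order so that
/// emitted machine code can address fields as `[base + constant]`.
#[repr(C)]
pub struct JitContext {
    pub regs: [u64; NUM_REGS],
    pub gas: i64,
    pub memory: *mut u8,
    pub pc: u32,
}

// Offsets the code generator would bake into emitted instructions.
pub const REGS_OFFSET: usize = offset_of!(JitContext, regs);
pub const GAS_OFFSET: usize = offset_of!(JitContext, gas);
pub const MEMORY_OFFSET: usize = offset_of!(JitContext, memory);
pub const PC_OFFSET: usize = offset_of!(JitContext, pc);

// Compile-time checks: reordering the struct fails the build instead of
// silently producing native code that reads the wrong field.
const _: () = assert!(REGS_OFFSET == 0);
const _: () = assert!(GAS_OFFSET == 8 * NUM_REGS);

/// Byte offset of guest register `index` within the context.
pub const fn reg_offset(index: usize) -> usize {
    REGS_OFFSET + index * 8
}
```

The compile-time assertions are the main point of the sketch: any optimization that changes the context layout has to update the offsets in the same change.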

Multi-VM execution

  • CALL(CODE) → CREATE child VM (cap bitmask propagation)
  • CALL(HANDLE/CALLABLE) → suspend caller, run target VM
  • REPLY → return to caller, restore gas
  • Context switch = register swap (all VMs sharing a CODE cap use the same 4GB window)
  • recompiler_resume_cap: fast JitContext re-entry when same VM + same code cap continues
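
The CALL/REPLY flow above can be sketched as a small state machine. Everything here is a simplified model under stated assumptions: the `Kernel`, `Vm`, and `VmState` definitions are hypothetical, VM ids are not validated, and the gas rule (the caller gets back its reserved gas plus whatever the callee did not spend) is one plausible reading of "restore gas", not a description of kernel.rs.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VmState { Idle, Running, Suspended }

pub struct Vm {
    pub regs: [u64; 13],
    pub code_cap_id: u32,
    pub state: VmState,
}

pub struct Kernel {
    pub vms: Vec<Vm>,
    pub current: u16,
    /// Registers of the running VM (stands in for the live JIT context).
    pub live_regs: [u64; 13],
    pub gas: i64,
    /// (caller id, caller gas held back until REPLY).
    call_stack: Vec<(u16, i64)>,
}

impl Kernel {
    /// Starts with VM 0 running; `vms` must be non-empty.
    pub fn new(mut vms: Vec<Vm>, gas: i64) -> Self {
        vms[0].state = VmState::Running;
        let live_regs = vms[0].regs;
        Self { vms, current: 0, live_regs, gas, call_stack: Vec::new() }
    }

    /// CALL: suspend the caller and run `target` with at most `gas_limit`.
    /// Returns true when both VMs share a code cap, i.e. when a fast
    /// re-entry path would be a candidate.
    pub fn call(&mut self, target: u16, gas_limit: i64) -> bool {
        let (c, t) = (self.current as usize, target as usize);
        let same_code = self.vms[c].code_cap_id == self.vms[t].code_cap_id;
        let limit = gas_limit.min(self.gas);
        // Context switch = register swap.
        self.vms[c].regs = self.live_regs;
        self.live_regs = self.vms[t].regs;
        self.vms[c].state = VmState::Suspended;
        self.vms[t].state = VmState::Running;
        self.call_stack.push((self.current, self.gas - limit));
        self.current = target;
        self.gas = limit;
        same_code
    }

    /// REPLY: return to the caller and restore its gas.
    pub fn reply(&mut self) -> Option<u16> {
        let (caller, held_back) = self.call_stack.pop()?;
        let (c, k) = (caller as usize, self.current as usize);
        self.vms[k].regs = self.live_regs;
        self.live_regs = self.vms[c].regs;
        self.vms[k].state = VmState::Idle;
        self.vms[c].state = VmState::Running;
        self.current = caller;
        self.gas += held_back;
        Some(caller)
    }
}
```

In this model the boolean returned by `call` marks the cases where the 4GB window does not change, which is where keeping the context alive across the switch looks most promising.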

Benchmark suite

# Single-VM workloads (8 benchmarks: fib, hostcall, sort, sieve, blake2b, keccak, ed25519, ecrecover)
cargo bench -p grey-bench --bench pvm_bench

# Multi-VM workload (fib_recur: recursive fibonacci via CREATE + CALL, 21K VMs)
cargo bench -p grey-bench --bench subvm_bench

# Grey-only (fast iteration)
cargo bench -p grey-bench -- 'grey-'

# PolkaVM comparison (pipeline gas metering)
POLKAVM_ALLOW_EXPERIMENTAL=1 POLKAVM_DEFAULT_COST_MODEL=full-l1-hit cargo bench -p grey-bench

Optimization areas

Code generation (recompiler/codegen.rs):

Multi-VM overhead (kernel.rs):

  • Context switch cost: register save/restore when switching between VMs
  • live_ctx optimization: currently disabled after VM switches — explore keeping it alive across CALL/REPLY when code_cap_id matches
  • VM allocation: Vec::push for new VMs — consider arena allocation
  • Cap table: 256 × Option per VM (~8KB) — consider sparse representation for VMs with few caps
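
As a starting point for the sparse cap table idea, one possible shape is a small sorted vector keyed by slot index. `Cap` below is a placeholder type and `SparseCapTable` is a hypothetical name; this is a sketch to benchmark against the dense table, not a proposal that it is faster.

```rust
/// Placeholder for the real capability type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cap(pub u64);

/// Sparse table: only occupied slots are stored, sorted by slot index.
pub struct SparseCapTable {
    slots: Vec<(u8, Cap)>,
}

impl SparseCapTable {
    pub fn new() -> Self {
        Self { slots: Vec::new() }
    }

    /// Number of occupied slots.
    pub fn len(&self) -> usize {
        self.slots.len()
    }

    /// Binary search over the occupied slots.
    pub fn get(&self, slot: u8) -> Option<Cap> {
        self.slots
            .binary_search_by_key(&slot, |&(s, _)| s)
            .ok()
            .map(|i| self.slots[i].1)
    }

    /// Inserts or overwrites a slot, returning the previous cap if any.
    pub fn insert(&mut self, slot: u8, cap: Cap) -> Option<Cap> {
        match self.slots.binary_search_by_key(&slot, |&(s, _)| s) {
            Ok(i) => Some(std::mem::replace(&mut self.slots[i].1, cap)),
            Err(i) => {
                self.slots.insert(i, (slot, cap));
                None
            }
        }
    }

    /// Removes a slot, returning the cap that was stored there.
    pub fn remove(&mut self, slot: u8) -> Option<Cap> {
        let i = self.slots.binary_search_by_key(&slot, |&(s, _)| s).ok()?;
        Some(self.slots.remove(i).1)
    }
}
```

The trade-off is that lookups become a binary search instead of a direct index, so this only pays off if VM creation cost outweighs per-ecalli cap lookups in the measured workloads.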

Kernel dispatch (kernel.rs):

  • dispatch_ecalli is #[inline(always)] — verify this stays optimal
  • ProtocolCall resume path vs full segment rebuild

Compilation cost:

  • JIT compilation happens once per CODE cap, not per VM. Already amortized for multi-VM workloads.
  • For short-lived programs (blake2b, keccak), compilation overhead dominates — tracked separately

Rules

  • Always benchmark before AND after. Use criterion's built-in comparison.
  • If a change shows no measurable improvement or regresses, revert it.
  • Do not use polkavm or polkavm-common crates — implement from first principles.
  • Verify correctness: cargo test -p grey-bench checks a0 values and exact gas match between interpreter and recompiler.
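
The correctness rule amounts to a differential check between the two backends. A minimal sketch of that idea follows; `RunResult` and `differential_check` are hypothetical names, and the backends are passed in as closures so the sketch does not assume anything about the real interpreter or recompiler APIs.

```rust
/// Observable outcome of one program run (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RunResult {
    pub a0: u64,
    pub gas_used: u64,
}

/// Runs the same program on both backends and requires an identical
/// a0 value and an exact gas match, treating the interpreter as reference.
pub fn differential_check<I, R>(
    program: &[u8],
    interpreter: I,
    recompiler: R,
) -> Result<RunResult, String>
where
    I: Fn(&[u8]) -> RunResult,
    R: Fn(&[u8]) -> RunResult,
{
    let expected = interpreter(program);
    let actual = recompiler(program);
    if expected.a0 != actual.a0 {
        return Err(format!(
            "a0 mismatch: interpreter={} recompiler={}",
            expected.a0, actual.a0
        ));
    }
    if expected.gas_used != actual.gas_used {
        return Err(format!(
            "gas mismatch: interpreter={} recompiler={}",
            expected.gas_used, actual.gas_used
        ));
    }
    Ok(expected)
}
```

Reporting a0 and gas mismatches separately makes it easier to tell a codegen bug from a gas-accounting bug when an optimization breaks the check.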

Replaces #56.
