perf: Store instruction as compact integer, extract fields on demand #2310

@gabrielbosio

Motivation

Currently, decode_instruction eagerly extracts all 12 fields from the encoded
instruction into a struct with separate fields. This has costs:

  • Decode time: All fields are extracted upfront, even if not all are used
    in a given code path
  • Memory: Instruction struct is ~48+ bytes vs 16 bytes for a u128
    wrapper
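
The memory claim can be checked rather than assumed. The sketch below compares sizes with std::mem::size_of. WideInstruction, CompactInstruction and Flag are hypothetical stand-ins (not the crate's real types), so the wide size it reports is only indicative and depends on the target's pointer width.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the crate's fieldless flag enums
// (Register, Op1Addr, Res, ...). The real definitions may differ.
#[allow(dead_code)]
#[derive(Clone, Copy)]
pub enum Flag {
    A,
    B,
    C,
    D,
}

// Mirrors the shape of the current struct: three offsets plus nine flags.
#[allow(dead_code)]
pub struct WideInstruction {
    pub off0: isize,
    pub off1: isize,
    pub off2: isize,
    pub flags: [Flag; 9],
}

// Mirrors the proposed wrapper around the raw encoding.
#[allow(dead_code)]
pub struct CompactInstruction(u128);

/// Returns (size of the wide layout, size of the compact layout) in bytes.
pub fn sizes() -> (usize, usize) {
    (
        size_of::<WideInstruction>(),
        size_of::<CompactInstruction>(),
    )
}
```

Running the same measurement against the real Instruction type would give the exact number for a given target.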

An alternative approach is to store the raw encoded instruction and extract
fields via bit operations only when accessed. This trades slightly higher field
access cost for lower decode cost and smaller memory footprint.

Current implementation

pub struct Instruction {
    pub off0: isize,
    pub off1: isize,
    pub off2: isize,
    pub dst_register: Register,
    pub op0_register: Register,
    pub op1_addr: Op1Addr,
    pub res: Res,
    pub pc_update: PcUpdate,
    pub ap_update: ApUpdate,
    pub fp_update: FpUpdate,
    pub opcode: Opcode,
    pub opcode_extension: OpcodeExtension,
}

Proposed implementation

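// Copy lets the by-value `self` accessors below be called repeatedly
// without consuming the instruction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Instruction(u128);

impl Instruction {
    pub fn offset0(self) -> isize { /* bit extraction */ }
    pub fn dst_register(self) -> Register { /* bit extraction */ }
    pub fn opcode_extension(self) -> OpcodeExtension { /* bit extraction */ }
    // ... etc
}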
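
As a rough sketch of what the "bit extraction" bodies could look like, here are the offset and dst_register accessors. Assumptions, to be checked against decoder.rs: offsets are stored as 16-bit values biased by 2^15, dst_reg is the first flag bit (bit 48 of the word), and 0 maps to AP while 1 maps to FP. from_raw and the constant names are hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Register {
    AP,
    FP,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Instruction(u128);

impl Instruction {
    const OFFSET_MASK: u128 = 0xFFFF;
    // Assumption: offsets are biased by 2^15, so 0x8000 encodes 0.
    const OFFSET_BIAS: i32 = 1 << 15;
    // Assumption: dst_reg is the lowest flag bit, and flags start at bit 48.
    const DST_REG_BIT: u32 = 48;

    /// Hypothetical unchecked constructor, used here only for illustration.
    pub fn from_raw(raw: u128) -> Self {
        Instruction(raw)
    }

    #[inline]
    fn biased_offset(self, shift: u32) -> isize {
        // Take 16 bits at `shift`, then remove the bias to get a signed value.
        let raw = ((self.0 >> shift) & Self::OFFSET_MASK) as i32;
        (raw - Self::OFFSET_BIAS) as isize
    }

    #[inline]
    pub fn offset0(self) -> isize {
        self.biased_offset(0)
    }

    #[inline]
    pub fn offset1(self) -> isize {
        self.biased_offset(16)
    }

    #[inline]
    pub fn offset2(self) -> isize {
        self.biased_offset(32)
    }

    #[inline]
    pub fn dst_register(self) -> Register {
        if (self.0 >> Self::DST_REG_BIT) & 1 == 1 {
            Register::FP
        } else {
            Register::AP
        }
    }
}
```

Under these assumptions, Instruction::from_raw(0x480680017fff8000) yields offsets 0, -1 and 1 and a dst_register of AP.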

Prior art

PR #1762 attempted this optimization but is now outdated and has bugs:

  • Used u64 encoding, but main now uses u128 to support Stwo opcodes
  • Missing OpcodeExtension field (Blake, BlakeFinalize, QM31Operation)
  • Incorrect bit masks for ap_update (used 3 bits instead of 2)
  • Wrong handling of ApUpdate::Add2 (it's derived from opcode == Call, not
    encoded separately)
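
To illustrate why the mask width matters (this is not the code from PR #1762, only a sketch of the class of bug): ap_update occupies flag bits 10-11, and opcode starts at flag bit 12. A 3-bit mask at shift 10 therefore also picks up the lowest opcode bit. The function and constant names are hypothetical.

```rust
// Flag bit positions are relative to the start of the 16-bit flags group.
pub const AP_UPDATE_SHIFT: u32 = 10;
pub const AP_UPDATE_MASK_2BIT: u16 = 0b11; // covers flag bits 10-11
pub const AP_UPDATE_MASK_3BIT: u16 = 0b111; // too wide: also covers flag bit 12

/// Extracts the ap_update number from the flags using the given mask.
pub fn ap_update_bits(flags: u16, mask: u16) -> u16 {
    (flags >> AP_UPDATE_SHIFT) & mask
}
```

With flags = 0x1000 (only flag bit 12 set, i.e. the lowest opcode bit, ap_update bits clear), the 2-bit mask gives 0 while the 3-bit mask gives 4, which is not a valid ap_update value.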

Current instruction encoding

From decoder.rs:

// opcode_extension|   opcode|ap_update|pc_update|res_logic|op1_src|op0_reg|dst_reg
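//  79 ... 17 16 15| 14 13 12|    11 10|  9  8  7|     6  5|4  3  2|      1|      0

The bit indices in the comment are relative to the start of the flags section, which begins at bit 48 of the encoded word. In absolute bit positions:

  • Bits 0-47: Three 16-bit offsets (off0, off1, off2)
  • Bits 48-62: Flags (dst_reg, op0_reg, op1_src, res_logic, pc_update,
    ap_update, opcode)
  • Bits 63 and up: Opcode extension (0=Stone, 1=Blake, 2=BlakeFinalize,
    3=QM31Operation)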
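
The layout above translates into shift-and-mask extraction roughly as follows. This is a sketch that returns raw, unvalidated numbers. RawFlags, flag_bits and raw_flags are hypothetical names, and the field widths are read off the layout comment.

```rust
/// Raw flag numbers, before any validation or mapping to enums.
#[derive(Debug, PartialEq, Eq)]
pub struct RawFlags {
    pub dst_reg: u8,
    pub op0_reg: u8,
    pub op1_src: u8,
    pub res_logic: u8,
    pub pc_update: u8,
    pub ap_update: u8,
    pub opcode: u8,
    pub opcode_extension: u128,
}

// Flags start at bit 48, after the three 16-bit offsets.
const FLAGS_OFFSET: u32 = 48;

/// Extracts `width` bits starting at flag-relative bit `lo`.
fn flag_bits(encoded: u128, lo: u32, width: u32) -> u128 {
    (encoded >> (FLAGS_OFFSET + lo)) & ((1u128 << width) - 1)
}

pub fn raw_flags(encoded: u128) -> RawFlags {
    RawFlags {
        dst_reg: flag_bits(encoded, 0, 1) as u8,
        op0_reg: flag_bits(encoded, 1, 1) as u8,
        op1_src: flag_bits(encoded, 2, 3) as u8,
        res_logic: flag_bits(encoded, 5, 2) as u8,
        pc_update: flag_bits(encoded, 7, 3) as u8,
        ap_update: flag_bits(encoded, 10, 2) as u8,
        opcode: flag_bits(encoded, 12, 3) as u8,
        // Everything from flag bit 15 (absolute bit 63) upwards.
        opcode_extension: encoded >> (FLAGS_OFFSET + 15),
    }
}
```

For the encoded word 0x480680017fff8000 (flags 0x4806) this gives op0_reg = 1, op1_src = 1, ap_update = 2, opcode = 4 and zero for the remaining fields.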

Tasks

  • Implement Instruction as a u128 wrapper with accessor methods
  • Implement TryFrom<u128> with proper validation (matching current
    decode_instruction error cases)
  • Handle derived fields correctly:
    • ApUpdate::Add2 when opcode == Call && ap_update_num == 0
    • Res::Unconstrained when res_logic == 0 && pc_update == Jnz
    • FpUpdate derived from Opcode
  • Validate Stwo opcode constraints (Blake flags, QM31 flags)
  • Update call sites to use accessor methods instead of field access
  • Benchmark to verify performance improvement
  • Consider adding #[inline] hints on accessors
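
A possible shape for the TryFrom<u128> and derived-field tasks, as a sketch only. It validates once at construction so that accessors can be infallible afterwards. The Add2 and FpUpdate rules come from the task list above. The numeric mappings (opcode 0/1/2/4, ap_update 0/1/2), the FpUpdate variant names and the DecodeError type are assumptions to be checked against decode_instruction. Res, PcUpdate and the Stwo opcode-extension checks are left out for brevity.

```rust
use std::convert::TryFrom;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Opcode { NOp, Call, Ret, AssertEq }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ApUpdate { Regular, Add, Add1, Add2 }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FpUpdate { Regular, APPlus2, Dst }

// Hypothetical error type; the real one lives in the VM crate.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DecodeError { InvalidOpcode(u128), InvalidApUpdate(u128) }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Instruction(u128);

impl Instruction {
    // Flag bit positions are relative to bit 48 of the encoded word.
    fn flag_bits(self, lo: u32, width: u32) -> u128 {
        (self.0 >> (48 + lo)) & ((1u128 << width) - 1)
    }

    fn try_opcode(self) -> Result<Opcode, DecodeError> {
        // Assumed mapping: 0 = NOp, 1 = Call, 2 = Ret, 4 = AssertEq.
        match self.flag_bits(12, 3) {
            0 => Ok(Opcode::NOp),
            1 => Ok(Opcode::Call),
            2 => Ok(Opcode::Ret),
            4 => Ok(Opcode::AssertEq),
            n => Err(DecodeError::InvalidOpcode(n)),
        }
    }

    fn try_ap_update(self) -> Result<ApUpdate, DecodeError> {
        let is_call = self.try_opcode()? == Opcode::Call;
        // Add2 is derived: opcode == Call with ap_update_num == 0.
        match (self.flag_bits(10, 2), is_call) {
            (0, true) => Ok(ApUpdate::Add2),
            (0, false) => Ok(ApUpdate::Regular),
            (1, false) => Ok(ApUpdate::Add),
            (2, false) => Ok(ApUpdate::Add1),
            (n, _) => Err(DecodeError::InvalidApUpdate(n)),
        }
    }

    #[inline]
    pub fn opcode(self) -> Opcode {
        self.try_opcode().expect("validated in TryFrom")
    }

    #[inline]
    pub fn ap_update(self) -> ApUpdate {
        self.try_ap_update().expect("validated in TryFrom")
    }

    // FpUpdate is not encoded; it is derived from the opcode.
    #[inline]
    pub fn fp_update(self) -> FpUpdate {
        match self.opcode() {
            Opcode::Call => FpUpdate::APPlus2,
            Opcode::Ret => FpUpdate::Dst,
            _ => FpUpdate::Regular,
        }
    }
}

impl TryFrom<u128> for Instruction {
    type Error = DecodeError;

    fn try_from(raw: u128) -> Result<Self, Self::Error> {
        let instruction = Instruction(raw);
        // Run every fallible decode once so later accessors cannot fail.
        instruction.try_opcode()?;
        instruction.try_ap_update()?;
        Ok(instruction)
    }
}
```

One trade-off in this sketch: the accessors re-run the match on every call. Whether that is cheaper than the eager struct in practice is what the benchmark task should settle.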

Labels: performance (Performance-related improvements or regressions)
