EVM Opcode Summarization System Technical Report

## Executive Summary

The EVM Opcode Summarization System is a framework developed within the KEVM ([evm-semantics](https://github.com/runtimeverification/evm-semantics)) project to automatically generate summarized, single-step execution rules for Ethereum Virtual Machine opcodes. This system addresses the critical need for efficient formal verification by condensing complex multi-step opcode executions into atomic, provably correct summary rules. The methodology of this work will be provided in the corresponding academic publication.

## Summarization Workflow

The summarization process follows a systematic three-phase approach, as documented in the `KEVMSummarizer` class:

### Phase 1: Specification Building (`build_spec`)

The first phase constructs formal specifications for symbolic execution of each opcode:

#### 1.1 Initial State Construction
```python
def _build_spec(self, op: str, stack_needed: int, ...):
    # Construct initial EVM state
    _init_subst['K_CELL'] = KSequence([next_opcode, KVariable('K_CELL')])
    _init_subst['WORDSTACK_CELL'] = KEVM.wordstack(stack_needed)
    _init_subst['GAS_CELL'] = KEVM.inf_gas(KVariable('GAS_CELL', 'Gas'))
```

#### 1.2 Opcode-Specific Handling
The system categorizes opcodes and builds specialized specifications as follows:

- **Account Query Opcodes** (`BALANCE`, `EXTCODESIZE`, etc.):
  ```python
  if op in ACCOUNT_QUERIES_OPCODES:
      # Create both normal and otherwise cases
      specs.append((..., '_NORMAL'))
      specs.append((..., '_OWISE'))
  ```

- **Storage Opcodes** (`SLOAD`, `SSTORE`, `TLOAD`, `TSTORE`):
  ```python
  elif op in ACCOUNT_STORAGE_OPCODES:
      cell, constraint = accounts_cell('ID_CELL')
      init_subst['ACCOUNTS_CELL'] = cell
  ```

- **Jump Operations** (`JUMP`, `JUMPI`):
  ```python
  elif op == 'JUMPI':
      # Create separate specs for true/false conditions
      specs.append((..., '_FALSE'))
      specs.append((..., '_TRUE'))
  ```

#### 1.3 Constraint Generation
The system generates constraints for:
- Stack underflow prevention
- Gas cost calculations
- State consistency checks
- Account existence conditions

### Phase 2: KCFG Exploration (`explore`)

The second phase performs symbolic execution to explore all possible execution paths:

#### 2.1 Execution Environment Setup
```python
def explore(self, proof: APRProof) -> bool:
    with legacy_explore(
        self.kevm,
        kcfg_semantics=KEVMSemantics(allow_symbolic_program=True),
        kore_rpc_command=('kore-rpc-booster',),
        max_depth=1,  # Single-step execution
        max_iterations=300,
    ) as kcfg_explore:
```

#### 2.2 Proof Execution
The system runs the formal prover with a specific configuration:
- **Max depth**: 1 (A debugging value for the number of rewriting steps per iteration; the typical setting is 1000.)
- **Max iterations**: 300  (A debugging value chosen to limit the number of symbolic execution iterations, allowing for a quicker assessment of the verification results and an early decision on proof continuity.)
- **Cut point rules**: All disabled for atomic execution
- **Terminal rules**: Enabled for completion detection

#### 2.3 Result Validation
Each proof is validated to ensure:
- No pending nodes (incomplete execution)
- No stuck nodes (execution failures)
- No bounded nodes (depth limitations)
- Proper terminal states

### Phase 3: Rule Summarization (`summarize`)

The final phase transforms the KCFG into clean, usable summary rules:

#### 3.1 KCFG Minimization
```python
def summarize(self, proof: APRProof, merge: bool = False):
    proof.minimize_kcfg(KEVMSemantics(allow_symbolic_program=True), merge)
```

#### 3.2 Rule Transformation Pipeline
The system applies multiple transformations to generate clean rules:

1. **Variable Normalization**: `_transform_dash()`
   - Removes underscore prefixes from variable names
   - Standardizes variable naming conventions

2. **Gas Handling**: `_transform_inf_gas()`
   - Converts infinite gas to regular gas calculations
   - Adds appropriate gas constraint guards

3. **Account State Cleanup**: `_transform_dot_account_var()`
   - Simplifies account cell representations
   - Removes redundant account existence checks

4. **Function Abstraction**: `_transform_lhs_functions()`
   - Converts complex LHS functions to variables
   - Adds corresponding equality constraints

#### 3.3 Rule Generation
```python
def _to_rules(self, proof: APRProof) -> list[KRule]:
    # Convert KCFG to K rules
    module = APRProofShow(...).kcfg_show.to_module(proof.kcfg)
    for krule in module.sentences:
        # Apply transformations
        body, requires, ensures = _transform_dash(...)
        rule_id = _transform_rule_id(proof.id, requires)
        # Generate final rule
        krules.append(KRule(body, requires, ensures, atts))
```

#### 3.4 File Generation
The system generates two types of output files:

1. **Individual Summary Files**: `{opcode}-summary.k`
   - Contains rules specific to one opcode
   - Includes necessary imports and dependencies

2. **Master Summary File**: `summaries.k`
   - Imports all individual summary files
   - Provides unified access to all summaries

### Workflow Integration

#### Entry Point Function
```python
def summarize(opcode_symbol: str) -> tuple[KEVMSummarizer, list[APRProof]]:
    summarizer = KEVMSummarizer(proof_dir, save_directory)
    proofs = summarizer.build_spec(opcode_symbol)
    for proof in proofs:
        if proof_exists:
            proof = APRProof.read_proof_data(proof_dir, proof.id)
        else:
            summarizer.explore(proof)  # Phase 2
        summarizer.summarize(proof)    # Phase 3
    return summarizer, proofs
```

#### Batch Processing
```python
def batch_summarize(num_processes: int = 4):
    with Pool(processes=num_processes) as pool:
        pool.map(_process_opcode, get_passed_opcodes())
```

### Quality Assurance in Workflow

#### Validation Checks
At each phase, the system performs validation:

1. **Specification Phase**: 
   - Stack underflow prevention
   - Constraint consistency
   - State validity

2. **Exploration Phase**:
   - Proof completion
   - Node state validation
   - Execution path coverage

3. **Summarization Phase**:
   - Rule correctness
   - Transformation validity
   - File generation success

#### Error Handling
The system includes comprehensive error handling:
```python
def _process_opcode(opcode: str) -> None:
    try:
        summarize(opcode)
        _LOGGER.info(f'Successfully processed opcode: {opcode}')
    except Exception as e:
        _LOGGER.error(f'Failed to process opcode {opcode}: {str(e)}')
        _LOGGER.debug(traceback.format_exc())
```

### Performance Optimizations

#### Caching Strategy
- Existing proofs are reused when available
- Proof data is persisted to disk
- Incremental processing supported

#### Parallel Processing
- Multiple opcodes processed simultaneously
- Configurable process pool size
- Independent opcode processing

#### Memory Management
- KCFG minimization reduces memory usage
- Temporary data cleanup after processing
- Efficient rule transformation pipelines

This three-phase workflow ensures that each EVM opcode is systematically analyzed, formally verified, and transformed into efficient, single-step execution rules that maintain correctness while optimizing performance for downstream verification tools.

## System Architecture Overview

### Core Components

The summarization functionality is concentrated in four main components:

1. **Core Engine**: `summarizer.py` (943 lines) - The main implementation
2. **CLI Integration**: `cli.py` and `__main__.py` - Command-line interface and execution
3. **Testing Framework**: `test_summarize.py` and `test_prove.py` - Validation and verification
4. **Generated Summaries**: `/summaries` directory - 68 individual opcode summary files

### Additional Supporting Components

Based on the commit analysis, the system also includes:
- **KEVM Helper Methods**: Extended functionality in the core KEVM system
- **EDSL Modules**: Domain-specific language support for summary generation
- **Build Configuration**: Integration with the K build system through `kdist/plugin.py`
- **Wrapper Functions**: Utility functions for opcode processing
- **Gas Cost Analysis**: Support for gas cost summarization across all opcodes

## Technical Architecture

### 1. KEVMSummarizer Class

The core `KEVMSummarizer` class provides three main functionalities:

```python
class KEVMSummarizer:
    def build_spec(self, op: str) -> list[APRProof]:
        """Build specifications for symbolically executing an opcode."""
    
    def explore(self, proof: APRProof) -> bool:
        """Execute the specification to explore the KCFG."""
    
    def summarize(self, proof: APRProof, merge: bool = False) -> None:
        """Minimize the KCFG to get summarized rules."""
```

### 2. Opcode Coverage and Status

The system tracks 68 different EVM opcodes with their implementation status:

```python
OPCODES: Final = frozendict({
    'STOP': KApply('STOP_EVM_NullStackOp'),
    'ADD': KApply('ADD_EVM_BinStackOp'),
    'MUL': KApply('MUL_EVM_BinStackOp'),
    # ... additional 65 opcodes
})
```

Each opcode has an associated status in `OPCODES_SUMMARY_STATUS`, tracking whether it has been successfully summarized.

### 3. Specialized Processing Categories

The system categorizes opcodes into specialized processing groups:

- **Account Query Opcodes**: `BALANCE`, `EXTCODESIZE`, `EXTCODEHASH`, `EXTCODECOPY`
- **Storage Opcodes**: `SLOAD`, `SSTORE`, `TLOAD`, `TSTORE`
- **Gas Usage Opcodes**: Special handling for gas cost calculations
- **Stack Manipulation**: `DUP`, `SWAP`, `LOG` with custom stack validation

### 4. Proof Generation and Validation

The system generates formal proofs for each opcode through:

1. **Specification Building**: Creates symbolic execution specifications
2. **KCFG Exploration**: Explores all possible execution paths
3. **Proof Minimization**: Reduces complex execution graphs to summary rules
4. **Validation**: Verifies correctness through automated testing

## Implementation Details

### Stack Underflow Prevention

The system implements sophisticated stack underflow checking:

```python
def stack_needed(opcode: str) -> int:
    """Return the stack size needed for the opcode."""
    
def stack_delta(opcode: str) -> int | None:
    """Return the stack delta for the opcode."""
```

### Gas Cost Integration

Gas cost summarization is supported across all opcodes with special handling for:
- Berlin hard fork compatibility
- Use-gas vs no-gas scenarios
- Dynamic gas calculations

### Account State Management

The system provides advanced account state management:

```python
def accounts_cell(acct_id: str | KInner, exists: bool = True) -> tuple[KInner, KInner]:
    """Construct an account cell map with constraints."""
```

## Generated Outputs

### Summary Files Structure

The system generates 68 individual summary files (`.k` files) in the `/summaries` directory:

- **Individual Opcode Summaries**: e.g., `add-summary.k`, `mul-summary.k`
- **Master Summary File**: `summaries.k` that imports all individual summaries
- **Specialized Categories**: Balance operations have separate normal and otherwise cases

### Rule Transformation Pipeline

The system applies multiple transformations to generate clean summary rules:

1. **Variable Name Normalization**: `_transform_dash()` 
2. **Gas Handling**: `_transform_inf_gas()` - Converts infinite gas to regular gas
3. **Account State Cleanup**: `_transform_dot_account_var()` 
4. **Function Abstraction**: `_transform_lhs_functions()` - Converts LHS functions to variables

## CLI Integration

### Command Structure

The system provides a comprehensive CLI interface:

- Summarize specific opcode: `kevm-pyk summarize --opcode ADD`
- Summarize all supported opcodes: `kevm-pyk summarize`
- Clear existing proofs: `kevm-pyk summarize --clear`

### Build Targets

The system integrates with K's build system through specialized targets:

- `summary`: Haskell-based summary generation
- `llvm-summary`: LLVM-based summary generation  
- `haskell-summary`: Haskell-based summary validation

## Testing and Validation

### Test Categories

1. **Unit Tests**: `test_summarize.py` - Tests individual opcode summarization
2. **Integration Tests**: `test_prove.py` - Validates summary correctness
3. **Proof Validation**: Each generated summary is proven correct

### Validation Criteria

For each opcode, the system validates:
- No pending, failing, bounded, or stuck nodes
- Single successor from initial node
- Terminal or covered end states
- Proof correctness through formal verification

## Performance Optimization

### Parallel Processing

The system supports parallel processing for batch operations:

```python
def batch_summarize(num_processes: int = 4) -> None:
    """Parallelize the summarization of opcodes."""
```

### Configuration Optimizations

Based on the commit analysis, the system includes several performance optimizations:
- **Depth Limiting**: Max depth reduced to 1 for efficiency
- **Upstream Integration**: Uses optimized `llvm_interpret` function
- **Code Elimination**: Removed 871 lines of redundant code in optimization phases

## Quality Assurance

### Code Standards

The system maintains high code quality through:
- **Standardized Labeling**: Consistent naming conventions across 71 files
- **Variable Naming**: Systematic use of `_XXX` for unused variables
- **Comprehensive Testing**: Each opcode has corresponding test coverage

### Error Handling

Robust error handling for:
- Stack underflow conditions
- Gas calculation edge cases
- Account state inconsistencies
- Proof generation failures

## Integration with Broader Ecosystem

### EVM Equivalence Support

The summarization system directly supports the evm-equivalence project by:
- Providing atomic opcode summaries for cross-model verification
- Enabling equivalence proofs between different EVM implementations
- Supporting optimization potential for downstream verification tools

### Verification Tool Integration

The system integrates with:
- **Kontrol**: Formal verification framework
- **KEVM**: Core K-based EVM semantics
- **PyK**: Python-based K framework
- **Booster**: High-performance execution engine

## Future Extensibility

### Modular Design

The system's modular architecture allows for:
- Easy addition of new opcodes
- Extension to new EVM versions
- Integration with other verification frameworks
- Customization for specific use cases

### Scalability Considerations

Design features supporting scalability:
- Parallel processing capabilities
- Incremental summarization
- Configurable depth and complexity limits
- Efficient KCFG management

## Conclusion

The EVM Opcode Summarization System represents a significant advancement in formal verification tooling for Ethereum. By automatically generating provably correct, single-step execution summaries for 68 EVM opcodes, it provides a foundational capability for efficient formal verification, cross-model equivalence checking, and optimization of verification workflows.

The system's comprehensive architecture, robust testing framework, and integration with the broader K ecosystem make it a valuable contribution to the formal verification community and the broader Ethereum ecosystem.

## Technical Specifications

- **Total Lines of Code**: ~943 lines (core engine)
- **Supported Opcodes**: 68 unique EVM opcodes
- **Generated Files**: 71 summary files (70 individual opcode files + 1 master summary)
- **Test Coverage**: 100% of supported opcodes with formal proofs
- **Parallel Processing**: Configurable multi-process support
- **Integration Points**: CLI, build system, testing framework
- **Performance**: Optimized for single-step execution with max depth 1 

EVM Opcode Summarization System Technical Report #2778

Description

Executive Summary

Summarization Workflow

Phase 1: Specification Building (build_spec)

1.1 Initial State Construction

1.2 Opcode-Specific Handling

1.3 Constraint Generation

Phase 2: KCFG Exploration (explore)

2.1 Execution Environment Setup

2.2 Proof Execution

2.3 Result Validation

Phase 3: Rule Summarization (summarize)

3.1 KCFG Minimization

3.2 Rule Transformation Pipeline

3.3 Rule Generation

3.4 File Generation

Workflow Integration

Entry Point Function

Batch Processing

Quality Assurance in Workflow

Validation Checks

Error Handling

Performance Optimizations

Caching Strategy

Parallel Processing

Memory Management

System Architecture Overview

Core Components

Additional Supporting Components

Technical Architecture

1. KEVMSummarizer Class

2. Opcode Coverage and Status

3. Specialized Processing Categories

4. Proof Generation and Validation

Implementation Details

Stack Underflow Prevention

Gas Cost Integration

Account State Management

Generated Outputs

Summary Files Structure

Rule Transformation Pipeline

CLI Integration

Command Structure

Build Targets

Testing and Validation

Test Categories

Validation Criteria

Performance Optimization

Parallel Processing

Configuration Optimizations

Quality Assurance

Code Standards

Error Handling

Integration with Broader Ecosystem

EVM Equivalence Support

Verification Tool Integration

Future Extensibility

Modular Design

Scalability Considerations

Conclusion

Technical Specifications

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Phase 1: Specification Building (`build_spec`)

Phase 2: KCFG Exploration (`explore`)

Phase 3: Rule Summarization (`summarize`)