Closed
35 commits
a1f2153
Add BloatNet tests
gballet Aug 14, 2025
02d65b4
try building the contract
gballet Aug 14, 2025
e721cc6
fix: SSTORE 0 -> 1 match all values in the state
gballet Aug 14, 2025
d1cad25
add the tx for 0 -> 1 and 1 -> 2
gballet Aug 14, 2025
16f6d30
fix: linter issues
gballet Aug 14, 2025
374e08a
remove more whitespaces
gballet Aug 14, 2025
333c876
fix formatting
gballet Aug 15, 2025
79a95b8
move to benchmarks
gballet Aug 21, 2025
8131e98
fix linter value
gballet Aug 22, 2025
5f805fd
use the gas limit from the environment
gballet Aug 22, 2025
090a400
parameterize the written value in SSTORE
gballet Aug 26, 2025
cd02a02
fix linter issues
gballet Aug 26, 2025
1f3c381
update CHANGELOG.md
gballet Aug 26, 2025
f6def7e
fix format
gballet Aug 26, 2025
7e20a50
simplify syntax
gballet Aug 26, 2025
c24ad35
fix: start with an empty contract storage
gballet Aug 26, 2025
fc27e53
more fixes, but the result is still incorrect
gballet Aug 26, 2025
7d87262
fix: finally fix the tests
gballet Aug 26, 2025
8556014
linter fix
gballet Aug 27, 2025
326915e
add SLOAD tests
gballet Aug 27, 2025
1f8e62a
test(benchmark): implement CREATE2 addressing for bloatnet tests
CPerezz Aug 29, 2025
8babb13
refactor(benchmark): optimize gas calculations in bloatnet tests
CPerezz Sep 11, 2025
e70132b
refactor(benchmark): bloatnet tests with unique bytecode for I/O opt…
CPerezz Sep 11, 2025
0e889d7
refactor(benchmark): replace custom CREATE2 address calculation with …
CPerezz Sep 11, 2025
e4583b6
CREATE2 factory approach working
CPerezz Sep 17, 2025
06f9a63
Version with EIP-7997 model working
CPerezz Sep 17, 2025
49c1343
refactor(benchmark): imrpove contract deployment script with interact…
CPerezz Sep 17, 2025
2875cf4
delete: remove obsolete test_create2.py script
CPerezz Sep 18, 2025
b634ca3
refactor(benchmark): optimize gas calculations for BALANCE + EXTCODEC…
CPerezz Sep 18, 2025
774c56c
refactor(benchmark): support non-fixed max_codesize
CPerezz Sep 18, 2025
6e6863a
chore: Remove all 24kB "hardcoded" refs
CPerezz Sep 18, 2025
f2cd5f9
fix: pre-commit lint hooks
CPerezz Sep 18, 2025
cf2c7c6
push updated deploy_create2_factory refactored with EEST as dep
CPerezz Sep 18, 2025
a862f76
refactor(benchmark): enhance CREATE2 factory deployment and testing
CPerezz Sep 19, 2025
55396fb
remove: old_deploy_factory script
CPerezz Sep 19, 2025
1 change: 1 addition & 0 deletions docs/CHANGELOG.md
@@ -131,6 +131,7 @@ Users can select any of the artifacts depending on their testing needs for their

### 🧪 Test Cases

- ✨ [BloatNet](https://bloatnet.info)/Multidimensional Metering: Add benchmarks to be used as part of the BloatNet project and also for Multidimensional Metering.
- ✨ [EIP-7951](https://eips.ethereum.org/EIPS/eip-7951): Add additional test cases for modular comparison.
- 🔀 Refactored `BLOBHASH` opcode context tests to use the `pre_alloc` plugin in order to avoid contract and EOA address collisions ([#1637](https://github.com/ethereum/execution-spec-tests/pull/1637)).
- 🔀 Refactored `SELFDESTRUCT` opcode collision tests to use the `pre_alloc` plugin in order to avoid contract and EOA address collisions ([#1643](https://github.com/ethereum/execution-spec-tests/pull/1643)).
1 change: 1 addition & 0 deletions pyproject.toml
@@ -146,6 +146,7 @@ exclude = [
'^fixtures/',
'^logs/',
'^site/',
'^tests/benchmark/bloatnet/deploy_.*\.py$',
]
plugins = ["pydantic.mypy"]

144 changes: 144 additions & 0 deletions tests/benchmark/bloatnet/README.md
@@ -0,0 +1,144 @@
# BloatNet Benchmark Tests setup guide

## Overview

This README is intended as a guide for anyone who wants to run the BloatNet test/benchmark suite on any network.
See https://hackmd.io/9icZeLN7R0Sk5mIjKlZAHQ for the list of BloatNet benchmark cases.
These tests stress client implementations to find the limits of their processing, focusing specifically on state-related operations.

This document walks through deploying the setup contracts required by the benchmarks in `tests/benchmark/bloatnet`.

## Gas Cost Constants

### BALANCE + EXTCODESIZE Pattern
**Gas per contract: ~2,772**
- `SHA3` (CREATE2 address generation): 30 gas (static) + 18 gas (dynamic for 85 bytes)
- `BALANCE` (cold access): 2,600 gas
- `POP`: 2 gas
- `EXTCODESIZE` (warm): 100 gas
- `POP`: 2 gas
Review comment (Member):

This test is basically two scenarios in one: it exercises both BALANCE (which marks the account warm) and EXTCODESIZE.

Note that accounts in the state's Merkle Patricia Trie (MPT) are stored as:

[nonce, balance, storageRoot, codeHash]

Reading the balance from the MPT therefore "just" requires reading the account. EXTCODESIZE, however, means the client has to query codeHash and then look up the full code in the DB to determine its size. This assumes the client has not optimized the lookup in some way, for instance with an extra database that maps codeHash => codeSize, which would skip reading all the code first.

So, I believe we need scenarios for BALANCE/EXTCODESIZE.

For EXTCODESIZE, I think this benchmark test is what you want:

Review comment (Member):

For BALANCE (cold) this test:

Review comment (Contributor):

I'm not sure I follow you here.

If these standalone scenarios already exist, as you correctly pointed out, and my PR adds the combination of them in a single test, what else is needed beyond what this PR adds?

Review comment (Contributor):

Are you saying we need to implement something else? What I want is to test the two opcodes in combination and observe whether any client has optimizations that apply. This is part of the scenarios I want to implement for bloatnet: https://hackmd.io/9icZeLN7R0Sk5mIjKlZAHQ#Opcode-State-Access-Combination-Tests

Review comment (Member):

Sorry! You are right, I was thinking of this from a different perspective (opcodes in isolation). The combined test is indeed not written.

- Memory operations and loop overhead: ~20 gas
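
As a quick check of the breakdown above, the listed components can be summed in a few lines of Python. This is only a sketch: the constant and function names are illustrative, and the 20-gas figure for memory operations and loop overhead is the README's own estimate rather than a measured value. The 85-byte input is the standard CREATE2 preimage (1 prefix byte, 20-byte deployer address, 32-byte salt, 32-byte init code hash).

```python
# Sketch: sum the per-contract gas components listed above.
# Names are illustrative; the values come from the list in this README.

KECCAK_STATIC = 30            # SHA3 base cost
KECCAK_PER_WORD = 6           # SHA3 cost per 32-byte word hashed
CREATE2_PREIMAGE_BYTES = 85   # 0xff (1) + factory (20) + salt (32) + init code hash (32)


def keccak_gas(num_bytes: int) -> int:
    """Static plus dynamic SHA3 cost for hashing `num_bytes` bytes."""
    words = (num_bytes + 31) // 32  # round up to whole 32-byte words
    return KECCAK_STATIC + KECCAK_PER_WORD * words


def balance_extcodesize_gas() -> int:
    """Approximate gas per contract for the BALANCE + EXTCODESIZE pattern."""
    return (
        keccak_gas(CREATE2_PREIMAGE_BYTES)  # 30 + 18 = 48
        + 2600  # BALANCE, cold access
        + 2     # POP
        + 100   # EXTCODESIZE, warm access
        + 2     # POP
        + 20    # memory operations and loop overhead (estimate)
    )


if __name__ == "__main__":
    print(balance_extcodesize_gas())  # 2772
```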

### BALANCE + EXTCODECOPY(single-byte) Pattern
**Gas per contract: ~2,775**
- `SHA3` (CREATE2 address generation): 30 gas (static) + 18 gas (dynamic for 85 bytes)
- `BALANCE` (cold access): 2,600 gas
- `POP`: 2 gas
- `EXTCODECOPY` (warm, 1 byte): 100 gas (base) + 3 gas (copy 1 byte)
- Memory operations: 4 gas
- Loop overhead: ~20 gas

Note: Reading just 1 byte (specifically the last byte at offset 24575) forces the client
to load the entire 24KB contract from disk while minimizing gas cost. This allows
targeting nearly as many contracts as the EXTCODESIZE pattern while forcing maximum I/O.
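
The same arithmetic for this pattern can be sketched as follows. Summing the components exactly as listed gives 2,777 gas, two more than the rounded ~2,775 figure in the heading, so the heading is best read as an approximation. The dictionary keys and function names are illustrative, and `MAX_CODE_SIZE = 24576` is an assumption inferred from the offset 24575 quoted in the note.

```python
# Sketch: per-contract gas for the BALANCE + EXTCODECOPY (1 byte) pattern,
# using the component values listed above.

MAX_CODE_SIZE = 24576  # bytes; assumption, consistent with offset 24575 in the note

COMPONENTS = {
    "SHA3 static": 30,
    "SHA3 dynamic (85 bytes, 3 words)": 18,
    "BALANCE (cold)": 2600,
    "POP": 2,
    "EXTCODECOPY base (warm)": 100,
    "EXTCODECOPY copy (1 byte, 1 word)": 3,
    "memory operations": 4,
    "loop overhead (estimate)": 20,
}


def extcodecopy_pattern_gas() -> int:
    """Sum of the listed components for one contract."""
    return sum(COMPONENTS.values())


def last_byte_offset(code_size: int = MAX_CODE_SIZE) -> int:
    """Offset of the final byte of a contract with `code_size` bytes of code."""
    return code_size - 1


if __name__ == "__main__":
    print(extcodecopy_pattern_gas())  # 2777
    print(last_byte_offset())         # 24575
```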

## Required Contracts Calculation Example:

### For BALANCE + EXTCODESIZE:
| Gas Limit | Contracts Needed | Calculation |
| --------- | ---------------- | ------------------- |
| 1M | 352 | 1,000,000 ÷ 2,772 |
| 5M | 1,769 | 5,000,000 ÷ 2,772 |
| 50M | 17,690 | 50,000,000 ÷ 2,772 |
| 150M | 53,071 | 150,000,000 ÷ 2,772 |

### For BALANCE + EXTCODECOPY:
| Gas Limit | Contracts Needed | Calculation |
| --------- | ---------------- | ------------------- |
| 1M | 352 | 1,000,000 ÷ 2,775 |
| 5M | 1,768 | 5,000,000 ÷ 2,775 |
| 50M | 17,684 | 50,000,000 ÷ 2,775 |
| 150M | 53,053 | 150,000,000 ÷ 2,775 |

The CREATE2 address generation adds ~48 gas per contract but eliminates memory limitations in the test framework.
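
The division in the "Calculation" column can be reproduced with the sketch below. Note that plain floor division gives somewhat larger counts than the tables (for example 360 rather than 352 at 1M gas, and 54,112 rather than 53,071 at 150M gas for the first pattern). The tabulated counts are roughly 2% lower, so they may reflect overhead or a cost estimate that this section does not itemize; that reading is an assumption. The `reserved_gas` parameter is hypothetical and only illustrates how such overhead could be subtracted.

```python
# Sketch: how many contracts fit in a gas budget at a given per-contract cost.
# `reserved_gas` is a hypothetical knob for overhead such as intrinsic gas.


def contracts_for_gas(gas_limit: int, gas_per_contract: int, reserved_gas: int = 0) -> int:
    """Floor of (gas_limit - reserved_gas) / gas_per_contract, never negative."""
    if gas_per_contract <= 0:
        raise ValueError("gas_per_contract must be positive")
    usable = max(gas_limit - reserved_gas, 0)
    return usable // gas_per_contract


if __name__ == "__main__":
    for label, gas in [("1M", 1_000_000), ("5M", 5_000_000),
                       ("50M", 50_000_000), ("150M", 150_000_000)]:
        print(label, contracts_for_gas(gas, 2772), contracts_for_gas(gas, 2775))
```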

## Quick Start: 150M Gas Attack

### 1. Deploy CREATE2 Factory with Initcode Template

```bash
# Deploy the factory and initcode template (one-time setup)
python3 tests/benchmark/bloatnet/deploy_create2_factory_refactored.py

# Output will show:
# Factory deployed at: 0x... <-- Save this address
# Init code hash: 0x... <-- Save this hash
```

### 2. Deploy Contracts

Deploy contracts using the factory. Each contract will be unique due to ADDRESS-based randomness in the initcode.

#### Calculate Contracts Needed

Before running the deployment, calculate the number of contracts needed:
- For 150M gas BALANCE+EXTCODESIZE: 53,071 contracts
- For 150M gas BALANCE+EXTCODECOPY: 53,053 contracts

_Deploy enough contracts to cover the max gas you plan to use in your tests/benchmarks._

#### Running the Deployment

```bash
# Deploy contracts for 150M gas attack
python3 tests/benchmark/bloatnet/deploy_create2_factory_refactored.py \
--deploy-contracts 53100

# For smaller tests (e.g., 1M gas)
python3 tests/benchmark/bloatnet/deploy_create2_factory_refactored.py \
--deploy-contracts 370
```

#### Deployment Output

After successful deployment, the script will display:

```
✅ Successfully deployed 53100 contracts
NUM_DEPLOYED_CONTRACTS = 53100
```

### 3. Update Test Configuration

Edit `tests/benchmark/bloatnet/test_bloatnet.py` and update with values from deployment:

```python
FACTORY_ADDRESS = Address("0x...") # From step 1 output
INIT_CODE_HASH = bytes.fromhex("...") # From step 1 output
NUM_DEPLOYED_CONTRACTS = 53100 # From step 2 output
```
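
For context, the factory address and init code hash are the inputs from which CREATE2 addresses are derived: the address is the last 20 bytes of `keccak256(0xff ++ deployer ++ salt ++ keccak256(init_code))`, which is where the 85-byte hash input in the gas breakdown comes from. The sketch below builds that preimage in Python. Using the contract index as the salt is an assumption made for illustration, not something this README states, and the Keccak-256 function is passed in by the caller because Python's `hashlib.sha3_256` is not Keccak-256.

```python
# Sketch: CREATE2 address derivation (EIP-1014).
# Assumption: salt = contract index; the actual scheme used by the
# deployment script may differ.
from typing import Callable


def create2_preimage(factory: bytes, salt: int, init_code_hash: bytes) -> bytes:
    """Return the 85-byte preimage 0xff ++ factory ++ salt ++ init_code_hash."""
    if len(factory) != 20:
        raise ValueError("factory address must be 20 bytes")
    if len(init_code_hash) != 32:
        raise ValueError("init code hash must be 32 bytes")
    return b"\xff" + factory + salt.to_bytes(32, "big") + init_code_hash


def create2_address(
    factory: bytes,
    salt: int,
    init_code_hash: bytes,
    keccak256: Callable[[bytes], bytes],
) -> bytes:
    """Last 20 bytes of the Keccak-256 hash of the preimage.

    `keccak256` must be a real Keccak-256 implementation supplied by the
    caller; hashlib.sha3_256 uses different padding and gives other results.
    """
    return keccak256(create2_preimage(factory, salt, init_code_hash))[12:]
```

With a Keccak-256 implementation in hand, looping `salt` over `range(NUM_DEPLOYED_CONTRACTS)` would enumerate the deployed addresses under the salt assumption above.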

### 4. Run Benchmark Tests

#### Generate Test Fixtures
```bash
# Run with specific gas values (in millions)
uv run fill --fork=Prague --gas-benchmark-values=150 \
tests/benchmark/bloatnet/test_bloatnet.py --clean

# Multiple gas values
uv run fill --fork=Prague --gas-benchmark-values=1,5,50,150 \
tests/benchmark/bloatnet/test_bloatnet.py
```

#### Execute Against Live Client
```bash
# Start a test node (e.g., Geth)
geth --dev --http --http.api eth,web3,net,debug

# Run tests
uv run execute remote --rpc-endpoint http://127.0.0.1:8545 \
--rpc-chain-id 1337 --rpc-seed-key 0x0000000000000000000000000000000000000000000000000000000000000001 \
tests/benchmark/bloatnet/test_bloatnet.py \
--fork=Prague --gas-benchmark-values=150 -v
```

#### With EVM Traces for Analysis
```bash
uv run fill --fork=Prague --gas-benchmark-values=150 \
--evm-dump-dir=traces/ --traces \
tests/benchmark/bloatnet/test_bloatnet.py

# Analyze opcodes executed
# (the ** pattern needs recursive globbing: zsh, or bash with `shopt -s globstar`)
jq -r '.opName' traces/**/*.jsonl | sort | uniq -c
```
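
If `jq` is not available, the same opcode tally can be sketched in Python. This assumes the trace files are JSON Lines with an `opName` field per step, as the `jq` command above implies. The function names and the `traces` directory default are illustrative. Unlike the `jq` pipeline, lines without `opName` (such as a trailing summary line) are skipped rather than counted as `null`.

```python
# Sketch: count executed opcodes across *.jsonl trace files.
# Assumes one JSON object per line with an "opName" key for each EVM step.
import json
from collections import Counter
from pathlib import Path
from typing import Iterable


def count_opcodes(lines: Iterable[str]) -> Counter:
    """Tally the opName field over an iterable of JSON Lines strings."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        if isinstance(entry, dict) and "opName" in entry:
            counts[entry["opName"]] += 1
    return counts


def count_opcodes_in_dir(trace_dir: str) -> Counter:
    """Tally opcodes over every *.jsonl file below `trace_dir`."""
    total: Counter = Counter()
    for path in sorted(Path(trace_dir).rglob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            total.update(count_opcodes(handle))
    return total


if __name__ == "__main__":
    for name, count in count_opcodes_in_dir("traces").most_common():
        print(f"{count:>10} {name}")
```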