feat(bloatnet): Add first multi-opcode benchmarks for Bloatnet #2090

gballet · 2025-09-01T07:22:27Z

🗒️ Description

- Add CREATE2 deterministic address calculation to overcome 24KB bytecode limit
- Fix While loop condition to properly iterate through contracts
- Account for memory expansion costs in gas calculations
- Add safety margins (50k gas reserve, 98% utilization) for stability
- Tests now scale to any gas limit without bytecode constraints
- Achieve 98% gas utilization with 10M and 20M gas limits
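
The sizing logic described in the list above can be sketched roughly as follows. This is an illustration, not the PR's code: the function name, the 21,000-gas intrinsic cost parameter, and the example per-contract cost of 2,704 gas (the sum of the README figures quoted later in this thread, ignoring loop overhead) are assumptions. Only the 50k reserve and the 98% utilization target come from the description.

```python
def contracts_for_budget(
    gas_limit: int,
    cost_per_contract: int,
    intrinsic_gas: int = 21_000,  # assumption: base cost of a plain transaction
    gas_reserve: int = 50_000,    # safety reserve named in the description
    utilization_pct: int = 98,    # utilization target named in the description
) -> int:
    """Return how many contracts fit in the usable part of the gas budget."""
    usable = gas_limit * utilization_pct // 100 - intrinsic_gas - gas_reserve
    if usable <= 0 or cost_per_contract <= 0:
        return 0
    return usable // cost_per_contract


if __name__ == "__main__":
    # Hypothetical per-contract cost: 2600 + 2 + 100 + 2 gas, no loop overhead.
    for limit in (10_000_000, 20_000_000):
        print(limit, contracts_for_budget(limit, cost_per_contract=2_704))
```

Because the real attack loop also pays for address computation and loop control, the actual per-contract cost is higher and the resulting contract count lower than this sketch suggests.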

🔗 Related Issues or PRs

#1986

✅ Checklist

All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
uvx --with=tox-uv tox -e lint,typecheck,spellcheck,markdownlint
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
All: Considered adding an entry to CHANGELOG.md.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).
Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.

LouisTsai-Csie · 2025-09-02T03:59:28Z

@CPerezz Thanks for this PR. I reviewed the latest commit, starting with the test_bloatnet_extcodesize_balance test. If the following suggestions work for you, I can continue the review based on this approach.

It deploys many 24kB contracts with unique bytecode, then:
1. Calls BALANCE on all contracts (cold access) to warm them and fill cache
2. Calls EXTCODESIZE on all contracts (warm access) hoping cache evictions force re-reads

Based on this description, I suggest some optimizations below. I compared the two versions with the following command, using 10M as the gas limit:

uv run fill -v tests/benchmark/test_bloatnet.py::test_bloatnet_extcodesize_balance -m benchmark --gas-benchmark-values 10 --clean -s

With the current implementation, the num_contracts variable is 3454, which equals the operation count; with some optimization it could increase to 3682. However, I am not sure whether this implementation still aligns with the testing scenario for bloatnet, so please let me know if something is wrong here.

Your current approach is to (1) create many contracts via CREATE2 and (2) call these contracts with the BALANCE and EXTCODESIZE operations. In step 2, the CREATE2 address is calculated in the attack contract, but since these addresses are also calculated in step 1, we could hardcode these values in step 2, thus reducing the cost per iteration.

On the other hand, you mentioned the difficulty of consuming all the gas in the block here. You can pass this value to expected_benchmark_gas_used in blockchain_test, so that it compares the actual gas consumption during execution with the specified value. There is no need to consume up to gas_benchmark_value, and no padding is needed. I've posted a more detailed explanation in our thread in Mattermost, please take a look.

Please let me know what you think!

tests/benchmark/test_bloatnet.py

CPerezz · 2025-09-09T11:27:37Z

Hey @LouisTsai-Csie thanks for the comments.

> In step 2, the CREATE2 address is calculated in the attack contract, but since these address are calculated in Step 1 also, we could hardcode these value in the Step 2, thus reducing the cost per iteration.

Regarding this, notice that there's a constraint on what you propose: a contract is currently capped at 24kB, so we can only fit ~945 addresses in it. Therefore, unless I'm missing something, this should be the best way to do this. Otherwise we could have a chain of contracts that call one another like a linked list, but that is quite complex for the benefit (around 6%).
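
The ~945 and ~6% figures can be reproduced with a short back-of-the-envelope calculation. The per-address byte layout below is an assumption chosen for illustration (the thread does not show the exact accounting); the 24,576-byte limit and the 3454 and 3682 contract counts are taken from the discussion.

```python
MAX_CODE_SIZE = 24_576  # bytes, the 24 KiB contract size cap

# Hypothetical bytecode emitted per hardcoded address (assumption):
PER_ADDRESS_BYTES = {
    "PUSH20 <address>": 21,  # 1 opcode byte + 20 address bytes
    "DUP1": 1,               # keep a copy for the second opcode
    "BALANCE": 1,
    "POP": 1,
    "EXTCODESIZE": 1,
    "POP (second)": 1,
}


def max_hardcoded_addresses(code_size: int = MAX_CODE_SIZE) -> int:
    """How many per-address blocks fit into one contract."""
    return code_size // sum(PER_ADDRESS_BYTES.values())


def relative_gain(baseline: int, optimized: int) -> float:
    """Fractional increase in operations per block."""
    return optimized / baseline - 1


if __name__ == "__main__":
    print(max_hardcoded_addresses())           # 26 bytes per address
    print(f"{relative_gain(3454, 3682):.1%}")  # counts quoted in the review
```

Under this assumed layout each address costs 26 bytes of code, which is why a single attack contract cannot hold the thousands of addresses a 10M-gas block can touch.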

- Remove gas reserve and 98% utilization logic for contract calculations
- Directly calculate the number of contracts based on available gas
- Introduce precise expected gas usage calculations for better accuracy
- Ensure tests scale effectively without unnecessary constraints

…mization
- Update tests to generate unique bytecode for each contract, maximizing I/O reads during benchmarks.
- Clarify comments regarding bytecode generation and its impact on gas costs.
- Ensure CREATE2 addresses are calculated consistently using a base bytecode template.
- Improve test descriptions to reflect the changes in contract deployment strategy.

…utility function
- Remove the custom `calculate_create2_address` function in favor of the `compute_create2_address` utility.
- Update tests to utilize the new utility for consistent CREATE2 address calculations.
- Simplify code by eliminating unnecessary complexity in address calculation logic.
- Ensure that the CREATE2 prefix is directly set to 0xFF in the memory operation for clarity.
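
For readers unfamiliar with the address derivation these commits refer to, here is a minimal sketch of the CREATE2 preimage layout from EIP-1014. It is not the `compute_create2_address` utility itself; the helper names are hypothetical. The hash step is deliberately left out because Python's standard library has no Keccak-256 (`hashlib.sha3_256` uses the finalized SHA-3 padding and yields different digests), so a Keccak-256 implementation from a third-party library would be needed to obtain real addresses.

```python
def create2_preimage(deployer: bytes, salt: int, initcode_hash: bytes) -> bytes:
    """Build the 85-byte buffer that CREATE2 hashes: 0xff ++ deployer ++ salt ++ hash."""
    if len(deployer) != 20:
        raise ValueError("deployer must be a 20-byte address")
    if len(initcode_hash) != 32:
        raise ValueError("initcode_hash must be a 32-byte Keccak-256 digest")
    return b"\xff" + deployer + salt.to_bytes(32, "big") + initcode_hash


def address_from_digest(digest: bytes) -> bytes:
    """The address is the low 20 bytes of the Keccak-256 digest of the preimage."""
    if len(digest) != 32:
        raise ValueError("digest must be 32 bytes")
    return digest[12:]


if __name__ == "__main__":
    preimage = create2_preimage(bytes(20), salt=0, initcode_hash=bytes(32))
    print(len(preimage), hex(preimage[0]))  # total length and leading prefix byte
```

The 85-byte length (1 + 20 + 32 + 32) is the same size that appears in the `Op.SHA3(32 - 20 - 1, 85)` call quoted later in the review.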

tests/benchmark/bloatnet/test_bloatnet.py

CPerezz · 2025-09-17T21:33:16Z

tests/benchmark/bloatnet/deploy_create2_factory.py

+#!/usr/bin/env python3
+"""
+Deploy a simple CREATE2 factory for benchmark tests.
+This factory can be reused across all tests and allows deterministic addresses.
+"""


Unsure if it's ok to have this file here. I added it for completeness, but I can certainly keep it outside of this repo, in a gist or similar, and leave instructions in the README on how to use it.

As you prefer.

CPerezz · 2025-09-17T21:34:18Z

tests/benchmark/bloatnet/deploy_bloatnet_simple.py

+#!/usr/bin/env python3
+"""
+CREATE2 deployment script for bloatnet benchmark contracts.
+Uses a factory pattern to deploy contracts at deterministic addresses.
+
+Based on the pattern from EIP-7997, this deploys contracts using CREATE2
+so they can be accessed from any account and reused across tests.
+"""


I could probably also keep this one outside of this repo (in a gist or similar), but I feel it is much more necessary for completeness (so that the repo contains everything needed to use it).

Let me know if you prefer me to strip this away.

CPerezz · 2025-09-17T21:40:29Z

Left to do in this PR:

- Add a label for Bloatnet benchmarks (so we can trigger them as a group with execute).
- Update the SLOAD case from @gballet's PR, feat(tests): add first few single-opcode test for state access in BloatNet #2040, and possibly supersede his PR.
- The rest is in the comments I added.

jochem-brouwer

Left some general comments! To write this bytecode, I think it would be helpful to take a look at the tooling EEST provides, which makes writing it easier!

In general: most of the attack scenarios (in isolated form, so not BALANCE + EXTCODESIZE/EXTCODECOPY but rather BALANCE or EXTCODESIZE or EXTCODECOPY) are written in the benchmarks folder already (I linked those).

These are from tests related to zkEVM where we want to find the scenarios which yield the most zk cycles to get an idea how "bad" these are. This is a different perspective of "worst case" than we want to investigate here. Some tests will need some slight edits.

Here is what I think we should do:

The current tests (EXTCODESIZE/EXTCODECOPY (and BALANCE, but this will not change whether we target accounts with or without code)) target a HUGE AMOUNT of contracts with code. What we want to do is deploy these accounts on the bloatnet only once and then write our tests in such a way that they target these accounts. (Note: the "risk" of this is that these contracts end up in the memory cache of the clients, which we don't want, since the worst-case scenario means that the client has to read from disk and not from the memory cache.) This is (I think) a good start.

We can therefore take these written tests, edit those, skip the pre-deploy phase and instead run the attacks directly on the bloatnet where we target these predeployed contracts. We can also repurpose those accounts for other tests, like EXTCODEHASH tests.

For tests specifically executing code (so CALL, CALLCODE, DELEGATECALL, STATICCALL) we also have to take into account the JUMPDEST analysis, which will perform well on some clients for specific code but badly on other clients, which in turn perform better on differently formatted code (this depends on what pattern they have optimized for, mostly whether they store valid JUMPDESTs or the intermediate bytes). That test is here:

execution-spec-tests/tests/benchmark/test_worst_bytecode.py

Line 247 in 291fe00

def test_worst_initcode_jumpdest_analysis(

It would be helpful to immediately deploy the contracts with one of these patterns for the EXTCODECOPY/EXTCODESIZE tests such that we can repurpose the factory-created contracts for these CALL tests. (To investigate other JUMPDEST patterns and their impact we should thus deploy a new set of contracts, but this is something we will tackle later).

Let me know what you think! 😄 👍

jochem-brouwer · 2025-09-17T21:57:49Z

scripts/test_create2.py

+print(f"Balance: {w3.eth.get_balance(test_account) / 10**18:.4f} ETH")
+
+# Simple CREATE2 factory that returns the deployed address
+factory_bytecode = (


I believe this file got accidentally committed but will still leave some comments.

Nit: this is the factory initcode as this bytecode will deploy the CREATE2 factory contract in case of a create transaction.

jochem-brouwer · 2025-09-17T21:58:44Z

scripts/test_create2.py

+    "60" + "00"  # PUSH1 0x00
+    "37"  # CALLDATACOPY (copy all calldata to memory)
+
+    "60" + "00"  # PUSH1 0x00 (salt - using 0 for simplicity)


A fixed salt means that identical calldata sent to this contract will always be deployed at the same address; it is thus not possible to create "copies" from this contract which run the same initcode at a different address.

That's a good point! Anyways, removing this file as you pointed out. So we can ignore.
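
The fixed-salt point above can be illustrated without any hashing: the CREATE2 address is a deterministic function of the preimage `0xff ++ deployer ++ salt ++ keccak256(initcode)`, so identical preimages always map to the same address. The sketch below is a toy model with hypothetical helper names and placeholder values, comparing a fixed salt with a counter-style salt.

```python
def create2_preimage(deployer: bytes, salt: int, initcode_hash: bytes) -> bytes:
    """The buffer CREATE2 hashes to derive the new contract address (EIP-1014)."""
    return b"\xff" + deployer + salt.to_bytes(32, "big") + initcode_hash


def distinct_targets(deployer: bytes, initcode_hash: bytes, salts: list) -> int:
    """Count distinct preimages, i.e. distinct deployable addresses."""
    return len({create2_preimage(deployer, s, initcode_hash) for s in salts})


if __name__ == "__main__":
    factory = bytes(20)    # placeholder factory address (assumption)
    code_hash = bytes(32)  # placeholder initcode hash (assumption)
    print(distinct_targets(factory, code_hash, [0, 0, 0]))  # fixed salt
    print(distinct_targets(factory, code_hash, [0, 1, 2]))  # counter salt
```

With a fixed salt, the second and third deployments target an address that already holds code and therefore fail, whereas a counter salt yields a fresh address per call.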

jochem-brouwer · 2025-09-17T22:00:38Z

scripts/test_create2.py

+    "60" + "00"  # PUSH1 0x00
+    "52"  # MSTORE (store address at 0)
+    "60" + "20"  # PUSH1 0x20
+    "60" + "00"  # PUSH1 0x00


PUSH1 0 could also be written as PUSH0 (0x5f) - this is cheaper but not supported on chains which do not have PUSH0. See also https://eips.ethereum.org/EIPS/eip-7997

Bloatnet focuses precisely on Ethereum's mainnet. So for now I think we're fine using PUSH0.
We can always modify if needed.

jochem-brouwer · 2025-09-17T22:03:48Z

scripts/test_create2.py

+
+    # Verify by checking code
+    code = w3.eth.get_code(deployed_addr)
+    print(f"Deployed code length: {len(code)} bytes")


This should store as code: 0x00..0042 (32 bytes)

jochem-brouwer · 2025-09-17T22:05:33Z

tests/benchmark/bloatnet/deploy_bloatnet_simple.py

+    init_code = bytearray()
+
+    # Init code: PUSH2 size, PUSH1 offset, PUSH1 dest, CODECOPY, PUSH2 size, PUSH1 0, RETURN
+    bytecode_size = 24576


You can take the `fork: Fork` fixture (provided in tests) and call fork.max_code_size() to read this constant from the fork.

jochem-brouwer · 2025-09-17T22:17:23Z

tests/benchmark/bloatnet/deploy_create2_factory.py

+    # Returns the deployed address
+    factory_bytecode = bytes(
+        [
+            # Runtime code


I strongly suggest reading a bit about the tooling EEST provides for writing these kinds of bytecode tests.

For instance, here is CREATE2 factory code:

execution-spec-tests/tests/benchmark/test_worst_bytecode.py

Lines 142 to 160 in 291fe00

    factory_code = (
        Op.EXTCODECOPY(
            address=initcode_address,
            dest_offset=0,
            offset=0,
            size=Op.EXTCODESIZE(initcode_address),
        )
        + Op.MSTORE(
            0,
            Op.CREATE2(
                value=0,
                offset=0,
                size=Op.EXTCODESIZE(initcode_address),
                salt=Op.SLOAD(0),
            ),
        )
        + Op.SSTORE(0, Op.ADD(Op.SLOAD(0), 1))
        + Op.RETURN(0, 32)
    )

which will factory-deploy this initcode:

execution-spec-tests/tests/benchmark/test_worst_bytecode.py

Lines 118 to 137 in 291fe00

    initcode = (
        Op.MSTORE(0, Op.ADDRESS)
        + While(
            body=(
                Op.SHA3(Op.SUB(Op.MSIZE, 32), 32)
                # Use a xor table to avoid having to call the "expensive" sha3 opcode as much
                + sum(
                    (Op.PUSH32[xor_value] + Op.XOR + Op.DUP1 + Op.MSIZE + Op.MSTORE)
                    for xor_value in XOR_TABLE
                )
                + Op.POP
            ),
            condition=Op.LT(Op.MSIZE, max_contract_size),
        )
        # Despite the whole contract has random bytecode, we make the first opcode be a STOP
        # so CALL-like attacks return as soon as possible, while EXTCODE(HASH|SIZE) work as
        # intended.
        + Op.MSTORE8(0, 0x00)
        + Op.RETURN(0, max_contract_size)
    )

which itself gets called from another contract which will invoke the factory contract N times (calldata input)

execution-spec-tests/tests/benchmark/test_worst_bytecode.py

Lines 165 to 168 in 291fe00

    factory_caller_code = Op.CALLDATALOAD(0) + While(
        body=Op.POP(Op.CALL(address=factory_address)),
        condition=Op.PUSH1(1) + Op.SWAP1 + Op.SUB + Op.DUP1 + Op.ISZERO + Op.ISZERO,
    )

EEST will also help with things like formatting numbers for you, i.e. 0xff gets put in a PUSH1 but 0xffff will be put in a PUSH2 (so you don't have to do the bitwise logic, like the shifting and the AND masks, here).
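
To make the PUSH-width point concrete, the following is a standalone sketch of choosing the smallest PUSH opcode for a value. It is not EEST's implementation, and the function name is hypothetical; it only mirrors the behaviour described above (0xff fits PUSH1, 0xffff needs PUSH2).

```python
PUSH1 = 0x60  # PUSH1..PUSH32 occupy opcodes 0x60..0x7f


def encode_push(value: int) -> bytes:
    """Encode `value` with the narrowest PUSHn that can hold it."""
    if value < 0 or value >= 1 << 256:
        raise ValueError("value must fit in an unsigned 256-bit word")
    # Zero still needs one immediate byte when encoded as PUSH1 0x00.
    size = max(1, (value.bit_length() + 7) // 8)
    opcode = PUSH1 + size - 1
    return bytes([opcode]) + value.to_bytes(size, "big")


if __name__ == "__main__":
    for v in (0xFF, 0xFFFF, 24_576):
        print(hex(v), encode_push(v).hex())
```

Hand-written deployment scripts have to reimplement this kind of sizing with shifts and masks, which is the maintenance cost the comment is pointing at.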

I agree things would get much better. At the same time, I'm not convinced I want to add these side-scripts to EEST, as they would need to take care of the maintenance, which is a burden that might be unnecessary (these are only needed for the setup, not for the test execution).

Maybe @LouisTsai-Csie or @fselmo have some takes here.

To be clear, if the STEEL team doesn't want to have the deployment helper files within EEST (which I could definitely understand), then I'd prefer avoiding the dependency on EEST for the standalone files (as I'll publish them as a gist or similar so that they can be executed after a wget or whatever).

If they remain here, I agree the refactor makes sense. Pushed it for you all to see the diff in: cf2c7c6

jochem-brouwer · 2025-09-17T22:26:33Z

tests/benchmark/bloatnet/README.md

+- `BALANCE` (cold access): 2,600 gas
+- `POP`: 2 gas
+- `EXTCODESIZE` (warm): 100 gas
+- `POP`: 2 gas


This test is basically two scenarios in one. It tests both BALANCE (which marks the account warm) and EXTCODESIZE.

Note that accounts in the Merkle Patricia Trie in the state are stored as:

[nonce, balance, storageRoot, codeHash]

Thus reading the balance from the MPT will "just" require reading the account. EXTCODESIZE, however, means we have to query codeHash, and to get the size we have to look up all the code from the DB in order to determine the size (this assumes that the client has not optimized this in some way, for instance via an extra database like a codeHash => codeSize lookup, which would skip first reading all the code to determine the size).
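
The difference between the two access patterns can be modelled with a toy in-memory state. This is a conceptual sketch only: the class and method names are hypothetical, no real client is structured exactly like this, and the `code_size_index` dictionary stands in for the optional codeHash => codeSize optimization mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    nonce: int
    balance: int
    storage_root: bytes
    code_hash: bytes


class ToyState:
    def __init__(self):
        self.accounts = {}         # address -> Account (the trie leaf)
        self.code_db = {}          # code_hash -> full bytecode
        self.code_size_index = {}  # code_hash -> size (optional optimization)
        self.code_reads = 0        # how often full bytecode was fetched

    def add_contract(self, address, code_hash, code, balance=0):
        self.accounts[address] = Account(1, balance, bytes(32), code_hash)
        self.code_db[code_hash] = code
        self.code_size_index[code_hash] = len(code)

    def balance(self, address):
        # Only the account leaf is needed.
        return self.accounts[address].balance

    def extcodesize_naive(self, address):
        # Account leaf, then the whole bytecode, just to measure its length.
        code_hash = self.accounts[address].code_hash
        self.code_reads += 1
        return len(self.code_db[code_hash])

    def extcodesize_indexed(self, address):
        # Account leaf, then a small size record; bytecode is never loaded.
        return self.code_size_index[self.accounts[address].code_hash]
```

In this model BALANCE never touches `code_db`, while the naive EXTCODESIZE path fetches a full 24 KiB blob per contract, which is the extra I/O the benchmark is trying to expose.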

So, I believe we need separate scenarios for BALANCE and EXTCODESIZE.

For EXTCODESIZE, I think this benchmark test is what you want:

execution-spec-tests/tests/benchmark/test_worst_bytecode.py

Line 41 in 291fe00

Op.EXTCODESIZE,

For BALANCE (cold) this test:

execution-spec-tests/tests/benchmark/test_worst_stateful_opcodes.py

Line 39 in 291fe00

Op.BALANCE,

I'm not sure I follow you here.

If these standalone scenarios already exist as you correctly pointed out, and my PR adds the combination of them into a single test, what is actually needed further from what this PR adds?

Are you claiming we need to implement something? What I want is to test the combination of the 2 together. And observe if any client has optimizations that can be applied. This is all part of the following scenarios I want to implement for bloatnet: https://hackmd.io/9icZeLN7R0Sk5mIjKlZAHQ#Opcode-State-Access-Combination-Tests

Sorry! You are right, I was thinking of this from a different perspective (opcodes in isolation). The combined test is indeed not written.

jochem-brouwer · 2025-09-17T22:32:18Z

tests/benchmark/bloatnet/README.md

+- `BALANCE` (cold access): 2,600 gas
+- `POP`: 2 gas
+- `EXTCODECOPY` setup: ~100 gas
+- `EXTCODECOPY` (24KB): ~2,300 gas


(same comment regarding BALANCE + EXTCODECOPY here, should be in different tests so we can investigate the individual behavior of these opcodes instead of mixed)

For EXTCODECOPY, we want to force clients to read all code. But since we expand memory for the copied bytes and pay for those copied bytes also, copying 1 byte instead of the 24 kb should be sufficient (this forces clients to load code from disk since we need to know what that specific byte is). Therefore we should read the final byte of the contract.

This test is relevant:

execution-spec-tests/tests/benchmark/test_worst_bytecode.py

Line 192 in 291fe00

attack_call = Op.EXTCODECOPY(address=Op.SHA3(32 - 20 - 1, 85), dest_offset=96, size=1000)

I'm not sure why it copies 1000 bytes, but this should be edited to read 1 byte so we can target more accounts. (1000 byte copy likely from the original idea of these benchmarks (zkEVM) because we want to measure the worst case zk cycles there, not the worst state attack, so slightly different performance perspective regarding worst case scenarios there)

Note: in the linked tests there is also a calculation in the pre-setup phase to determine how many contracts are necessary. This uses an upper bound; it should be slightly less in practice, so the attack block will always read non-empty accounts.

The part about reading a single byte is pure gold! Thanks so much for this trick! I did not consider it, but it makes total sense!
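
The saving from copying one byte instead of the whole contract follows from the standard EVM cost formulas: EXTCODECOPY charges 3 gas per 32-byte word copied, plus memory expansion, plus the account access cost. The sketch below uses hypothetical function names and covers only the size-dependent parts.

```python
def words(size_bytes: int) -> int:
    """Number of 32-byte words needed to hold `size_bytes`."""
    return (size_bytes + 31) // 32


def copy_gas(size_bytes: int) -> int:
    """Per-word copy charge of EXTCODECOPY (3 gas per word)."""
    return 3 * words(size_bytes)


def memory_gas(size_bytes: int) -> int:
    """Total memory cost for a memory of `size_bytes`: 3*w + w^2 // 512."""
    w = words(size_bytes)
    return 3 * w + w * w // 512


def memory_expansion_gas(old_bytes: int, new_bytes: int) -> int:
    """Extra gas charged when memory grows from old_bytes to new_bytes."""
    return max(0, memory_gas(new_bytes) - memory_gas(old_bytes))


if __name__ == "__main__":
    for size in (24_576, 1_000, 1):
        print(size, copy_gas(size), memory_expansion_gas(0, size))
```

For a 24,576-byte copy the per-word charge alone is 2,304 gas, in line with the "~2,300 gas" README entry, whereas a 1-byte copy costs 3 gas and still requires the client to have the code available.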

jochem-brouwer · 2025-09-17T22:35:21Z

tests/benchmark/bloatnet/test_bloatnet.py

+    )
+
+    # Post-state: just verify attack contract exists
+    # Benchmark tests run out of gas, so no success flag


For these kinds of attacks OOG is fine, but if we do edit anything in state, it should not OOG, because we have to ensure the writes are flushed to the trie and not discarded (just a side note).

In our cases for SLOAD, EXTCODE_X for now we never write to state. So I'm getting more and more convinced of removing them basically.

WDYT?

Removing those read tests? They are definitely necessary. Although you do not write anything to state, it is not known beforehand that you do not write anything to state (besides: you actually do write something to state, namely you deduct gas*gasPrice from the tx sender and update the nonce, so you still need to know the gas spent on execution), so you have to execute these (disk read) operations anyway. We would still need to know what the worst-case behavior is when there are only disk read operations. So I would not recommend removing those.

This was committed unintentionally

…OPY pattern
- Updated the README to reflect the optimized gas cost for the BALANCE + EXTCODECOPY pattern, reducing it from ~5,007 to ~2,710 gas per contract.
- Modified the test_bloatnet_balance_extcodecopy function to read only 1 byte from the end of the bytecode, minimizing gas costs while maximizing contract targeting.
- Adjusted calculations for the number of contracts needed based on the new cost per contract, ensuring accurate benchmarks.

CPerezz · 2025-09-18T09:42:38Z

@jochem-brouwer thanks a lot for taking the time to review. A couple points:

> In general: most of the attack scenarios (in isolated form, so not BALANCE + EXTCODESIZE/EXTCODECOPY but rather BALANCE or EXTCODESIZE or EXTCODECOPY) are written in the benchmarks folder already (I linked those).

Thanks for pointing that out! Yes, in fact I ignored the single-opcode implementations of these cases for this exact reason. I think anyway that we will need to "fork them" or similar, since we also want to explore the opcodes by themselves but targeting already-deployed contracts (as our tests are meant to run worst cases at a 150M gas_limit, to account for 20% refund tricks).

So overall, I think this is a good next task too. And even if it's a bit ugly to duplicate the tests, I think avoiding the deployment of all these contracts is really important. At the same time, I don't want to touch the zkevm tests, so that they behave consistently against their prior benchmarks and their specific code.
LMK what you think.

> For tests specifically executing code (so CALL, CALLCODE, DELEGATECALL, STATICCALL) we have to also take into account the JUMPDEST analysis which will perform good on some clients for specific code, but bad on other clients which will perform better on other formatted code (this depends on what pattern they have optimized for, mostly if they store valid JUMPDEST or if they store intermediate bytes).

Here, the 63/64 rule will mostly kill any attempt to abuse CALL-like opcodes (at least that was my thought). Anyway, there are proposals that suggest removing it (see: https://ethereum-magicians.org/t/eip-7923-linear-page-based-memory-costing/23290/11).
Nonetheless, this is another of the tasks that needs to be done. I wonder if you have any suggestions on how to exploit the Bloatnet characteristics (big state & high gas limit) to abuse these opcodes as much as possible.

- Updated the deploy_create2_factory_refactored.py script to improve the deployment of a CREATE2 factory with an initcode template, allowing for dynamic contract address generation.
- Modified test_bloatnet.py to support on-the-fly CREATE2 address generation, optimizing gas costs and improving test accuracy.
- Adjusted gas cost calculations in the README to reflect the new deployment approach, ensuring accurate benchmarks for BloatNet tests.

CPerezz · 2025-09-19T08:11:36Z

The PR is ready for review @LouisTsai-Csie @marioevz

I think once you approve, I will deploy all the contracts in Bloatnet and make the final commit updating the addresses. Then we can merge.

jochem-brouwer · 2025-09-19T14:25:15Z

> Here with the 63/64 rule will mostly kill any attempt to abuse CALL-like opcodes (at least that was my thought). Anyways, there are proposals that suggest removing it (see: https://ethereum-magicians.org/t/eip-7923-linear-page-based-memory-costing/23290/11).

Yes, this would be the case if we kept CALLing into contracts, but this is not the attack scenario here. The attack is to call from a root contract into the children contracts deployed via the CREATE2 factory (these should all have unique code). The call depth is thus at most 2. In the children contracts we perform a JUMP to the end of the code. This forces clients to perform JUMPDEST analysis, which means scanning all the opcodes (from the start of the code up to the target pc) to mark invalid bytes (the intermediate bytes of PUSH) as invalid JUMPDESTs, in order to verify that the JUMP target is indeed a valid JUMPDEST. This would therefore force clients to (1) load X unique contracts from disk and (2) perform JUMPDEST analysis on X*24 KiB of contract code. My feeling is that this would be relatively heavy on both I/O and execution logic. (This test is written, linked here:

execution-spec-tests/tests/benchmark/test_worst_bytecode.py

Line 247 in 9f96fad

def test_worst_initcode_jumpdest_analysis(

)
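
For reference, the JUMPDEST analysis described above boils down to a single linear pass that skips PUSH immediates. The following is a minimal sketch of that scan, not any particular client's implementation (clients typically store the result as a bitmap and may cache it per code hash).

```python
JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F


def valid_jumpdests(code: bytes) -> set:
    """Return the set of program counters that are valid JUMP targets."""
    valid = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            valid.add(pc)
        elif PUSH1 <= op <= PUSH32:
            # Skip the immediate bytes: a 0x5b inside PUSH data is not a JUMPDEST.
            pc += op - PUSH1 + 1
        pc += 1
    return valid


if __name__ == "__main__":
    # PUSH1 0x5b; JUMPDEST -> only the byte at pc=2 is a valid target.
    print(valid_jumpdests(bytes.fromhex("605b5b")))
```

Because the scan must start at byte 0 to know which bytes are PUSH data, validating a jump to the last byte of a 24 KiB contract touches the entire code.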

> So overall, I think this is a good next-task too. And even if it's a bit ugly to duplicate the tests, I think the avoidance of deploying all these contracts is really important. And, at the same time, I don't want to touch zkevm tests so that they have consistent behaviour agains their prior benchmarks against their specific code.
> LMK what you think.

I think we could duplicate the tests but write them in such a way that, if two tests target the same type of code deployed from the CREATE2 factory, we assume those child contracts are already deployed when executing these tests and then only run the attack phase of the test. This would avoid having to deploy this large number of contracts and, especially when benching these scenarios, makes it much easier to re-bench them (without having to wait X time until those thousands of contracts are deployed).

jochem-brouwer · 2025-09-19T14:30:56Z

tests/benchmark/bloatnet/deploy_create2_factory_refactored.py

+            condition=Op.LT(Op.MSIZE, MAX_CONTRACT_SIZE),
+        )
+        # Set first byte to STOP for efficient CALL handling
+        + Op.MSTORE8(0, 0x00)


PUSH1 1 CODESIZE SUB JUMP is a cheap way to force JUMPDEST analysis over the whole contract (24 KiB).

STOP is 0 gas, whereas this is 3 + 2 + 3 + 8 = 16 gas. For 16 gas you force clients to scan the 24 KiB of code. So rather cheap 😂

(Just noting this; it is not a suggestion, but something to keep in mind when combining these scenarios: if you are already in this call frame, for 16 gas you can also "cheaply" add this JUMPDEST analysis to the performance test.)
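
The 16-gas figure can be checked by assembling the four-opcode sequence and summing the static gas costs. The table layout and helper names below are assumptions made for this sketch; the opcode bytes and gas costs are the standard EVM values.

```python
# (mnemonic, opcode byte, immediate bytes, static gas)
SEQUENCE = [
    ("PUSH1", 0x60, b"\x01", 3),
    ("CODESIZE", 0x38, b"", 2),
    ("SUB", 0x03, b"", 3),   # computes CODESIZE - 1, the last byte of the code
    ("JUMP", 0x56, b"", 8),
]


def assemble(sequence) -> bytes:
    """Concatenate opcode bytes and their immediates."""
    return b"".join(bytes([op]) + imm for _, op, imm, _ in sequence)


def total_gas(sequence) -> int:
    """Sum the static gas of every instruction in the sequence."""
    return sum(gas for *_, gas in sequence)


if __name__ == "__main__":
    print(assemble(SEQUENCE).hex(), total_gas(SEQUENCE))
```

For the jump to succeed, the final byte of the contract would need to be a JUMPDEST (0x5b) that is not part of PUSH data; how a client behaves when the target is invalid is not covered by this sketch.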

LouisTsai-Csie

Hi @CPerezz, thanks for the latest changes. I ran your tests locally and have a few suggestions to help them pass.

Since this PR is based on PR #2040, I believe it can’t be merged until that one is merged (please correct me if I’m wrong). At the moment, we also can’t run the CI. My suggestion would be to open a new PR and migrate the tests/benchmark/bloatnet/test_bloatnet.py file there. We do not need to migrate the deploy_create2_factory_refactored.py and deploy_bloatnet_simple.py for now, as we discussed integration in a separate PR.

We can keep this PR open so I can create follow-up issues from it.

LouisTsai-Csie · 2025-09-22T09:23:10Z

tests/benchmark/bloatnet/test_bloatnet.py

+    Transaction,
+    While,
+)
+from ethereum_test_tools.vm.opcode import Opcodes as Op


There has been some folder restructuring in EEST. Please rebase first; the import path can then be updated as follows to avoid conflicts.

Suggested change

-from ethereum_test_tools.vm.opcode import Opcodes as Op
+from ethereum_test_vm.opcodes import Opcodes as Op

LouisTsai-Csie · 2025-09-22T09:23:53Z

tests/benchmark/bloatnet/test_bloatnet.py

+    num_contracts = min(contracts_needed, NUM_DEPLOYED_CONTRACTS)
+
+    if contracts_needed > NUM_DEPLOYED_CONTRACTS:
+        import warnings


nit: could we move this import to the beginning of the file?

LouisTsai-Csie · 2025-09-22T09:25:55Z

tests/benchmark/bloatnet/test_bloatnet.py

+    blockchain_test(
+        pre=pre,
+        blocks=[Block(txs=[attack_tx])],
+        post=post,
+    )


The CI will fail since the gas usage is mismatched, but we already resolved this issue in PR #2155:

Suggested change

-    blockchain_test(
-        pre=pre,
-        blocks=[Block(txs=[attack_tx])],
-        post=post,
-    )
+    blockchain_test(
+        pre=pre,
+        blocks=[Block(txs=[attack_tx])],
+        post=post,
+        skip_gas_used_validation=True,
+    )

LouisTsai-Csie · 2025-09-22T09:26:45Z

tests/benchmark/bloatnet/test_bloatnet.py

+    blockchain_test(
+        pre=pre,
+        blocks=[Block(txs=[attack_tx])],
+        post=post,
+    )


Suggested change

-    blockchain_test(
-        pre=pre,
-        blocks=[Block(txs=[attack_tx])],
-        post=post,
-    )
+    blockchain_test(
+        pre=pre,
+        blocks=[Block(txs=[attack_tx])],
+        post=post,
+        skip_gas_used_validation=True,
+    )

CPerezz · 2025-10-01T13:46:57Z

I believe this can be closed, @gballet, as it is superseded by #2186.

LouisTsai-Csie · 2025-10-03T13:06:19Z

Closing, as PR #2186 and PR #2242 are complete. Feel free to create issues if I missed any todo in this PR. Thanks everyone for the contributions, reviews and feedback.

gballet and others added 21 commits August 14, 2025 13:00

Add BloatNet tests

a1f2153

Signed-off-by: Guillaume Ballet <[email protected]>

try building the contract

02d65b4

Signed-off-by: Guillaume Ballet <[email protected]>

fix: SSTORE 0 -> 1 match all values in the state

e721cc6

Signed-off-by: Guillaume Ballet <[email protected]>

add the tx for 0 -> 1 and 1 -> 2

d1cad25

Signed-off-by: Guillaume Ballet <[email protected]>

fix: linter issues

16f6d30

Signed-off-by: Guillaume Ballet <[email protected]>

remove more whitespaces

374e08a

Signed-off-by: Guillaume Ballet <[email protected]> remove leftover single whitespace :|

fix formatting

333c876

move to benchmarks

79a95b8

Signed-off-by: Guillaume Ballet <[email protected]>

fix linter value

8131e98

use the gas limit from the environment

5f805fd

parameterize the written value in SSTORE

090a400

fix linter issues

cd02a02

update CHANGELOG.md

1f3c381

fix format

f6def7e

simplify syntax

7e20a50

fix: start with an empty contract storage

c24ad35

more fixes, but the result is still incorrect

fc27e53

fix: finally fix the tests

7d87262

linter fix

8556014

add SLOAD tests

326915e

LouisTsai-Csie requested changes Sep 2, 2025

LouisTsai-Csie reviewed Sep 2, 2025

LouisTsai-Csie mentioned this pull request Sep 8, 2025

feat(tests): add first few single-opcode test for state access in BloatNet #2040

Open

8 tasks

CPerezz added 3 commits September 11, 2025 12:09

LouisTsai-Csie marked this pull request as ready for review September 12, 2025 09:38

CPerezz reviewed Sep 17, 2025

jochem-brouwer reviewed Sep 17, 2025

CPerezz added 6 commits September 18, 2025 10:19

delete: remove obsolete test_create2.py script

2875cf4

This was committed unintentionally

refactor(benchmark): support non-fixed max_codesize

774c56c

chore: Remove all 24kB "hardcoded" refs

6e6863a

fix: pre-commit lint hooks

f2cd5f9

push updated deploy_create2_factory refactored with EEST as dep

cf2c7c6

gballet changed the title ~~test(benchmark): implement CREATE2 addressing for bloatnet tests~~ feat(bloatnet): Add first multi-opcode benchmarks for Bloatnet Sep 19, 2025

remove: old_deploy_factory script

55396fb

CPerezz force-pushed the feat/multi-opcode-bloatnet-bench branch from 67d52ea to 55396fb Compare September 19, 2025 08:57

jochem-brouwer reviewed Sep 19, 2025

LouisTsai-Csie mentioned this pull request Sep 22, 2025

feat: add bloatnet marker support #2169

Merged

8 tasks

LouisTsai-Csie requested changes Sep 22, 2025

LouisTsai-Csie mentioned this pull request Sep 22, 2025

refactor: update bloatnet benchmark cases #2185

Open

CPerezz mentioned this pull request Sep 22, 2025

feat(tests): multi opcode bloatnet ext cases #2186

Merged

8 tasks

CPerezz mentioned this pull request Oct 1, 2025

feat(benchmark): Add reversed bloatnet multi-opcode benchmarks with BALANCE-EXTCODE_ variants #2242

Merged

8 tasks

LouisTsai-Csie closed this Oct 3, 2025
