@marioevz
Member

@marioevz marioevz commented Nov 28, 2025

🗒️ Description

Introduces many enhancements to the execute remote and execute hive commands by porting ethereum/execution-spec-tests#1566 to this repo, along with other fixes.

Enhancements

Deferred EOA Funding

Changes the behavior of pre.fund_eoa during execute. Instead of immediately sending a transaction that funds the given EOA with a fixed amount of Eth in every test, it saves the incomplete funding transaction in a list, processes the rest of the test's transactions that have the given EOA as sender, calculates the exact amount of Eth required to send all of them at current network gas prices, and finally funds the EOA before submitting the test transactions.

This behavior only kicks in if pre.fund_eoa is not called with the amount parameter explicitly set. I.e. only pre.fund_eoa() or pre.fund_eoa(amount=None) use this deferred funding method.

This results in:

  • EOAs no longer being funded with excessive or insufficient Eth to execute the test transactions; they are always funded with the exact amount required to send all the transactions.
  • Minimizing the amount of Eth spent sending the test transactions due to the gas price being fetched from the network and updated dynamically.
  • More deterministic wait times for inclusion of the transactions since we fetch the gas prices from the network.
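
As a rough illustration of the funding calculation described above, the sketch below sums a worst-case cost for every transaction sent by one EOA. The PendingTx class and required_funding function are hypothetical names used only for this example, and the formula (gas limit times max fee, plus value) is an assumption about what "exact amount" means; blob gas is ignored.

```python
from dataclasses import dataclass


@dataclass
class PendingTx:
    """Hypothetical stand-in for a test transaction sent by the deferred EOA."""

    gas_limit: int
    max_fee_per_gas: int  # wei per gas, assumed to come from current network prices
    value: int = 0  # wei transferred by the transaction


def required_funding(txs: list[PendingTx]) -> int:
    """Return the wei the sender needs in order to submit every transaction."""
    return sum(tx.gas_limit * tx.max_fee_per_gas + tx.value for tx in txs)


txs = [
    PendingTx(gas_limit=21_000, max_fee_per_gas=10, value=1_000),
    PendingTx(gas_limit=100_000, max_fee_per_gas=10),
]
print(required_funding(txs))  # 1211000
```

Because the funding transaction is only sent after this sum is known, the EOA never has to be topped up mid-test.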

Dynamic Gas Prices

The execute command now dynamically fetches gas prices before each test runs; the fetched prices are considered stale after 12 seconds.

The default gas prices (gas_price, max_priority_fee_per_gas, max_fee_per_gas, max_fee_per_blob_gas) are now automatically set to 1.5x the current network prices.

The 1.5x multiplier is currently constant, but if we deem it necessary we could introduce another flag to change it.
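
A minimal sketch of the caching behavior described above, assuming a 12-second staleness window and a fixed 1.5x multiplier. GasPriceCache, its fetch callback and the injectable clock are hypothetical names chosen for this example; they are not taken from the PR.

```python
import time
from typing import Callable, Optional


class GasPriceCache:
    """Cache a network gas price and refetch it once it is older than max_age."""

    def __init__(
        self,
        fetch: Callable[[], int],
        max_age: float = 12.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._max_age = max_age
        self._clock = clock
        self._value: Optional[int] = None
        self._fetched_at = 0.0

    def get(self) -> int:
        """Return the cached price, refetching when it is missing or stale."""
        now = self._clock()
        if self._value is None or now - self._fetched_at >= self._max_age:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value

    def default_price(self) -> int:
        """Return 1.5x the cached network price using integer arithmetic."""
        return self.get() * 3 // 2
```

Passing the clock in as a parameter keeps the staleness logic testable without sleeping.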

To achieve this, we still assign the default gas price value (7) during transaction instantiation, but we remove gas_price, max_priority_fee_per_gas, max_fee_per_gas and max_fee_per_blob_gas from model_fields_set if the user did not explicitly set them. execute then inspects model_fields_set and, if the gas prices were not explicitly set, updates them with the current network prices for all test transactions.
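
The idea can be sketched in plain Python without pydantic. Here a fields_set attribute stands in for pydantic's model_fields_set, and Tx and apply_network_price are hypothetical names for this example only; the real transaction model has many more fields.

```python
DEFAULT_GAS_PRICE = 7  # placeholder default mentioned above


class Tx:
    """Hypothetical transaction that records which fields the caller set explicitly."""

    def __init__(self, **kwargs: int) -> None:
        # Stand-in for pydantic's model_fields_set: only user-supplied names.
        self.fields_set = set(kwargs)
        self.gas_price = kwargs.get("gas_price", DEFAULT_GAS_PRICE)


def apply_network_price(tx: Tx, default_gas_price: int) -> None:
    """Overwrite the placeholder only when the test did not pin a gas price."""
    if "gas_price" not in tx.fields_set:
        tx.gas_price = default_gas_price


unpinned = Tx()
pinned = Tx(gas_price=5)
apply_network_price(unpinned, 150)
apply_network_price(pinned, 150)
print(unpinned.gas_price, pinned.gas_price)  # 150 5
```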

Dry-run Mode

This mode is enabled by the --dry-run flag during execute remote in order to estimate how much Eth is going to be consumed when running the tests at current network gas prices.

It does not send any transactions to the network; it only collects all the transactions it would send and adds up the gas used and the gas prices.
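
One way to picture this is a sender that records transactions instead of submitting them. DryRunSender below is a hypothetical class for illustration, and using the gas limit as the gas figure is an assumption, since actual gas usage is unknown without executing the transaction.

```python
from dataclasses import dataclass, field


@dataclass
class DryRunSender:
    """Hypothetical sender that records transactions instead of submitting them."""

    total_gas: int = 0
    total_cost_wei: int = 0
    recorded: list = field(default_factory=list)

    def send(self, gas_limit: int, gas_price: int, value: int = 0) -> None:
        """Record the transaction and add its worst-case cost to the running totals."""
        self.recorded.append((gas_limit, gas_price, value))
        self.total_gas += gas_limit
        self.total_cost_wei += gas_limit * gas_price + value


sender = DryRunSender()
sender.send(gas_limit=21_000, gas_price=10)
sender.send(gas_limit=50_000, gas_price=10, value=500)
print(sender.total_gas, sender.total_cost_wei)  # 71000 710500
```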

Better Nonce Refresh for Worker Senders

This PR also removes _refresh_sender_nonce from execute's Alloc class and moves this behavior to its own fixture. We now separate the worker_key (previously sender_key) into two different fixtures:

  • session_worker_key: Determines the key that will fund all contracts and EOAs for the current pytest worker. This fixture is "session" scoped and also cleans up the worker key by sending the funds back to the seed account.
  • worker_key: This fixture takes the key from session_worker_key and updates its nonce from the current value from the network, then it's passed directly to the pre so no update of the nonce has to be performed there.
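
The split between the two fixtures can be sketched with plain functions, where a generator plays the role of a session-scoped fixture with teardown. FakeRPC and Key are hypothetical stand-ins, and the refund is reduced to a list append; the real fixtures are pytest fixtures that talk to a node.

```python
from dataclasses import dataclass
from typing import Iterator


class FakeRPC:
    """Hypothetical in-memory stand-in for a node's RPC interface."""

    def __init__(self) -> None:
        self.nonces: dict = {}
        self.refunded: list = []

    def get_transaction_count(self, address: str) -> int:
        return self.nonces.get(address, 0)


@dataclass
class Key:
    address: str
    nonce: int = 0


def session_worker_key(rpc: FakeRPC, worker_id: str) -> Iterator[Key]:
    """Session scope: create the worker key once and refund the seed on teardown."""
    key = Key(address=f"worker-{worker_id}")
    yield key
    rpc.refunded.append(key.address)  # placeholder for sending funds back


def worker_key(rpc: FakeRPC, session_key: Key) -> Key:
    """Per test: resync the nonce from the network before handing the key to pre."""
    session_key.nonce = rpc.get_transaction_count(session_key.address)
    return session_key
```

Refreshing once per test means a transaction that failed in an earlier test cannot leave the local nonce ahead of the network value.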

loadscope -> load for execute remote and execute hive

This is a minor change with a noticeable improvement when running tests from a single Python file.

Using loadscope is convenient for fill because the module's fixtures are normally a bottleneck to test generation, and this way all the tests from the same file are assigned to the same pytest worker.

This is not the case for execute since the worst bottleneck is transaction inclusion in the chain. Now tests are assigned to whichever worker is available so it can run tests while the rest of the workers are waiting for their transactions to be included, even for tests in the same file.

Example running command uv run execute remote --fork=Osaka ./tests/frontier/opcodes/test_dup.py --rpc-seed-key $RPC_SEED_KEY --rpc-endpoint $RPC_ENDPOINT --chain-id $RPC_CHAIN_ID -n 8 with a Kurtosis local network:

  • loadscope: 16 passed, 9 warnings in 602.58s (0:10:02)
  • load: 16 passed, 9 warnings in 97.46s (0:01:37)
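
An idealized back-of-the-envelope model of why load helps, assuming every test spends the same time waiting for inclusion and all tests live in one file. The function names and the 12-second wait are assumptions for illustration; the model ignores setup overhead, so it does not reproduce the measured timings above (a speedup of roughly 6x).

```python
import math


def loadscope_time(num_tests: int, wait_per_test: float) -> float:
    """All tests of one file land on one worker and run back to back."""
    return num_tests * wait_per_test


def load_time(num_tests: int, workers: int, wait_per_test: float) -> float:
    """Tests are spread over idle workers, so their waits overlap."""
    return math.ceil(num_tests / workers) * wait_per_test


print(loadscope_time(16, 12.0))  # 192.0
print(load_time(16, 8, 12.0))  # 24.0
```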

Note: test_dup.py is currently broken in forks/osaka because it sets the nonce and the gas_price, so this command will fail if you try to reproduce it.

Increased Logging

logger is now used more frequently in sender.py, pre_alloc.py and execute.py to print information about on-chain transaction execution during the tests, and to produce more verbose errors.

Blob Tests In Execute

All blob_test tests can now be run with execute, by skipping the getBlobsVX check which requires the Engine API.

This is a minor improvement mainly targeted to sending blob transactions on live networks, rather than attempting to test the engine_getBlobsVX endpoints.

The blob gas prices are also fetched from the network live during execution of the tests.

⚠️ Breaking Changes ⚠️

  1. Flag --eoa-fund-amount-default is removed.
  2. Flag --sender-key-initial-balance in execute hive is now --seed-key-initial-balance.
  3. Flags --default-gas-price, --default-max-fee-per-gas and --default-max-priority-fee-per-gas now default to None and ideally should be omitted because, when unset, the command now fetches the value from the network, which is more reliable.

🔗 Related Issues or PRs

N/A.

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered adding an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
  • Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry to the list.
  • Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

@codecov

codecov bot commented Nov 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.31%. Comparing base (3c58e8d) to head (45f21e4).
⚠️ Report is 2 commits behind head on forks/osaka.

Additional details and impacted files
@@             Coverage Diff              @@
##           forks/osaka    #1822   +/-   ##
============================================
  Coverage        87.31%   87.31%           
============================================
  Files              541      541           
  Lines            32832    32832           
  Branches          3015     3015           
============================================
  Hits             28668    28668           
  Misses            3557     3557           
  Partials           607      607           
Flag Coverage Δ
unittests 87.31% <ø> (ø)

@spencer-tb spencer-tb self-requested a review November 28, 2025 19:42
deploy_gas_limit = min(deploy_gas_limit * 2, 30_000_000)
print(f"Deploying contract with gas limit: {deploy_gas_limit}")

self._refresh_sender_nonce()
Collaborator

Removing this logic leads to issues for stateful benchmark testing and may also affect consensus testing.

We first construct a transaction (e.g., a contract deployment). The nonce assignment happens in the post-model initialization phase of the Pydantic model:

if "nonce" not in self.model_fields_set and self.sender is not None:
    self.nonce = HexNumber(self.sender.get_nonce())

The get_nonce function assumes that the nonce always increases sequentially. However, this assumption breaks in several scenarios:

  1. If two tests run sequentially, the framework assigns nonces 1 and 2. If the first transaction fails, the second transaction will remain pending indefinitely because its nonce will never become valid.
  2. If multiple workers submit transactions concurrently and accidentally reuse the same nonce, the duplicate nonce will invalidate one or more of those transactions.
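
Scenario 1 can be reproduced with a toy model. Chain below is a hypothetical simplification in which a transaction is included only when its nonce equals the account's current nonce; it numbers nonces from 0 and is not how a real transaction pool is implemented.

```python
class Chain:
    """Toy chain: a transaction is included only if its nonce matches the account nonce."""

    def __init__(self) -> None:
        self.account_nonce = 0

    def submit(self, nonce: int, valid: bool = True) -> str:
        if not valid:
            return "rejected"  # never enters the pool, account nonce unchanged
        if nonce != self.account_nonce:
            return "pending"  # nonce gap: waits for a transaction that never arrives
        self.account_nonce += 1
        return "included"


chain = Chain()
# A local counter hands out nonces 0 and 1 without consulting the network.
print(chain.submit(0, valid=False))  # rejected
print(chain.submit(1))  # pending
# Refreshing the nonce from the chain state unblocks the sender.
print(chain.submit(chain.account_nonce))  # included
```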

Point 1 is more critical, because we currently cannot ensure that all benchmark tests will run successfully. However, we still want the test suite to continue running without stopping.

This concern was previously raised and solved in a PR from @kamilchodola, but removing the logic here would reintroduce the same problems and may also impact his tooling.

One potential solution here is to update the nonce value based on the network's state:

if "nonce" not in self.model_fields_set and self.sender is not None:
    self.nonce = HexNumber(self._eth_rpc.get_transaction_count(self._sender, block_number="pending"))

I've tried this approach before, but I ran into a circular import issue at the time.

Member Author

Yep, agreed, (1) can be addressed by updating the nonce just a single time before running the test function so the nonce is up to date when the test starts requesting contracts and EOAs.

As for (2), I don't see how it can happen since all workers use different senders; otherwise concurrency would fail instantly.

Will fix (1), thanks for pointing it out.

Collaborator

Thanks for the change! I've reviewed again and this should be resolved.

@marioevz marioevz changed the title from "feat(execute): Deferred EOA funding" to "feat(execute): Many Improvements to execute remote command" Dec 1, 2025
@marioevz marioevz marked this pull request as ready for review December 1, 2025 20:30
@LouisTsai-Csie
Collaborator

Thanks for the enhancement! Could we also update the related docs so I can provide a reference to Kamil and Jochem?

@marioevz marioevz force-pushed the deferred-eoa-funding branch from ac90263 to 83d8717 Compare December 2, 2025 19:49