Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update EIP-6110: Spec parse_deposit_data #9460

Merged
merged 13 commits into from
Mar 27, 2025

Conversation

jochem-brouwer
Copy link
Member

@jochem-brouwer jochem-brouwer commented Mar 6, 2025

EIP-6110 currently does not specify how to parse the deposit event data. This PR adds the specific layout of the mainnet (and Sepolia) log data to the spec. If the deposit contract logs a DepositEvent but the layout is incorrect, the block will be marked as invalid, because the deposit contract does not adhere the deposit event layout.

Note that this approach does only spec the layout of the log: the actual bytecode of the contract is not specified. Therefore, mainnet, sepolia and holesky all match this spec. (Mainnet/Holesky bytecode is the same: Sepolia is however a "permissioned" ERC20-token contract (instead of accepting native Eth deposits) and thus has different bytecode. This thus allows any bytecode contract to be valid, given that the DepositEvent as being LOG-ed having the layout per this PR)

@github-actions github-actions bot added c-update Modifies an existing proposal t-core labels Mar 6, 2025
@eth-bot
Copy link
Collaborator

eth-bot commented Mar 6, 2025

✅ All reviewers have approved.

@eth-bot eth-bot added the a-review Waiting on author to review label Mar 6, 2025
@eth-bot eth-bot changed the title EIP-6110: Spec-out parse_deposit_data Update EIP-6110: Spec-out parse_deposit_data Mar 6, 2025
@jochem-brouwer jochem-brouwer changed the title Update EIP-6110: Spec-out parse_deposit_data Update EIP-6110: Spec parse_deposit_data Mar 6, 2025
EIPS/eip-6110.md Outdated
Parses deposit data from DepositContract.DepositEvent data.
"""

# TODO: verify byte length of deposit_event_data? Should be 576. Should this be part of the spec? (Yes?)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the failure mode for this check? Maybe we should do it here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added this also! 4f912ac 😄 👍

EIPS/eip-6110.md Outdated
# which represents the dynamic size of the encoded bytes event data
# The size itself is encoded in a 32-byte, so skip over this as well
# to arrive at the first actual data offset to be read (pubkey)
currentOffset = 32 * 5 + 32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have PUBKEY_OFFSET, WITHDRAWAL_CREDENTIALS_OFFSET, etc instead of these computations? and then:

pubkey = deposit_event_data[PUBKEY_OFFSET:PUBKEY_OFFSET + 48]
withdrawal_credentials = deposit_event_data[WITHDRAWAL_CREDENTIALS_OFFSET: WITHDRAWAL_CREDENTIALS_OFFSET + 32]
...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have added! Please verify if my offsets and sizes are correct 😄 👍

EIPS/eip-6110.md Outdated
pubkey = deposit_event_data[currentOffset:currentOffset + 48]
# The ABI data is encoded in 32-byte chunks, so the 64 bytes are reserved for the data
# The next 32 bytes are reserved for the size of the next item (withdrawals) so skip over this as well
currentOffset += 48 + 16 + 32
Copy link
Contributor

@benaadams benaadams Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an ABI isn't particularly important; what is important is the data fields are length prefixed byte arrays. The fixed offsets are a quirk of the current contracts.

Properly parsing (using length prefix) means it would still work if a contract on a different chain output the same data but trimmed leading zeros; however using fixed offsets would break (if the argument is not to use ABI because "it might change"; changing would first break the offsets) (edited)
[3:06 PM]
The data is literally saying the lengths of the data elements; seem irresponsible to ignore it and lay a presumed offset format on it? If it was C that's how you'd get buffer over/under runs.

What if the deposit contract was written in viper or some other language that decided to encode packed? Is valid by the event log specification, would fail to be parsed.

It's also both literally encoding ABI and also ignoring it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point regarding the ABI, I should not have added this as comment at all. Have removed :)

The goal this PR wants to solve is that parse_deposit_data is currently not specced. This PR should spec it. We have two approaches (and we should decide async or via ACD-E):

  • Hardcoded offsets/sizes (this PR)
  • Use ABI decoding

Note that the current EIP spec has this in it:

#### Deposit request

The structure denoting the new deposit request consists of the following fields:

1. `pubkey: Bytes48`
2. `withdrawal_credentials: Bytes32`
3. `amount: uint64`
4. `signature: Bytes96`
5. `index: uint64`

This is how the EIP-7685 request_data is being encoded. (This would also raise the question for me: why does the Deposit event not encode this directly in the event signature? But this is a side-question, we can't change this anyways, and neither can we change that on mainnet these bytes are always at the same offset and same size for all event data of the deposit event)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the OP of this PR also regarding this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this EIP is encoding that anything decoding using the specified ABI should fail if junky data was passed but the offset would "pass" which seems sloppy? Since the data would be wrong for anyone parsing the event upstream (and would the data even be correct in that circumstance?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure if I got this comment right but I think this applies here as well: #9460 (comment)

EIPS/eip-6110.md Outdated
@@ -97,7 +112,7 @@ def get_deposit_request_data(receipts)
deposit_requests = []
for receipt in receipts:
for log in receipt.logs:
is_deposit_event = (len(log.topics) > 0 and log.topics[0] == DEPOSIT_EVENT_SIGNATURE_HASH)
is_deposit_event = (len(log.topics) > 0 and log.topics[0] == DEPOSIT_EVENT_SIGNATURE_HASH and len(log.data) === 576)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when its the correct contract, correct signature but log.data is not exactly 576 bytes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It skips the log. Note that the current spec tells us how to encode the request (no changes to it in this PR):

1. `pubkey: Bytes48`
2. `withdrawal_credentials: Bytes32`
3. `amount: uint64`
4. `signature: Bytes96`
5. `index: uint64`

Note that if these value lengths are ABI encoded you would still end up with the 576 length event, because:

  • 5 offset Bytes32 (160 bytes total)
  • 5 size Bytes32 (160 bytes total)
  • pubkey: 48 bytes, but will get rounded up to 64 bytes
  • withdrawal_credentials: 32 bytes
  • amount: will get rounded up to 32 bytes
  • signature: 96 bytes
  • index: will get rounded up to 32 bytes

So, in total (5+5) * 32 + 64 + 32 + 32 + 96 + 32 = 576 bytes. If we would take, for instance, the pubkey and would now encode the actual bytes data as more than 64 bytes, then we would thus get extra 32 bytes extra in the log data, however, this cannot be encoded in the request as Bytes48 anymore (unless we want to ignore any data).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the current spec tells us how to encode the request

The spec does not specify lengths and offsets, it specifies that the input values are variable https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/deposit-contract.md#staking-deposit-contract

The deposit contract has a public deposit function to make deposits. It takes as arguments bytes calldata pubkey, bytes calldata withdrawal_credentials, bytes calldata signature, bytes32 deposit_data_root

and also that the logs are variable

Every deposit emits a DepositEvent log for consumption by the beacon chain. The deposit contract does little validation, pushing most of the validator onboarding logic to the beacon chain.

The L1 smart contract also specify that the types are variable https://etherscan.io/address/0x00000000219ab540356cbb839cbe05303d7705fa#code
Both in the input values

    function deposit(
        bytes calldata pubkey,
        bytes calldata withdrawal_credentials,
        bytes calldata signature,
        bytes32 deposit_data_root
    ) override external payable {

and in the log

event DepositEvent(
        bytes pubkey,
        bytes withdrawal_credentials,
        bytes amount,
        bytes signature,
        bytes index
    );

Where is it specified that they are fixed sized buffers of Bytes48, Bytes32, uint64, Bytes96, uint64?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I should have been more clear, I was not referring to the solidity spec, but to the encoding spec of the actual request (so the request which gets in the block header) of this EIP. See: https://eips.ethereum.org/EIPS/eip-6110#execution-layer and then to the "Deposit request" part:

image

Copy link
Contributor

@benaadams benaadams Mar 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok but those are post parsing the log data? i.e. parse_deposit_data and is about ho to convert to fixed sizes (for output)

Before that where is the len(log.data) === 576 guarantee in any prior spec?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There technically is none, but, the deposit event hash is now hardcoded in clients to filter out non-DepositEvent logs (so the Sepolia bug)

So this is given:

      event DepositEvent(
        bytes pubkey,
        bytes withdrawal_credentials,
        bytes amount,
        bytes signature,
        bytes index
    );

Note: keccak256("DepositEvent(bytes,bytes,bytes,bytes,bytes)="0x649bbc62d0e31342afea4e5cd82d4049e7e1ee912fc0889aa790803be39038c5"

By ABI encoding as mentioned here: #9460 (comment) if we keep the way how requests are encoded, then if more data would get added to one of these parameters, it would not fit in the requests data (and we thus need to edit how the request itself gets encoded in the header)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side-note, this extra check is taken from Geth's code: https://github.com/ethereum/go-ethereum/blob/cd78b65cdaa9e00866962e38685c4f93ac9e8444/core/types/deposit.go#L28

Note that on mainnet the deposit event length should be 576 bytes always (implied by the solidity implementation, did not check bytecode)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keccak256("DepositEvent(bytes,bytes,bytes,bytes,bytes)="0x649bbc62d0e31342afea4e5cd82d4049e7e1ee912fc0889aa790803be39038c5"

Yes and bytes are variable length values; not fixed length. So the ABI does not have fixed position offsets and lengths.

The L1 code does have require statements specifying the length of the params; however we know from 3 other chains (Sepolia, Chaido and Gnosis) that the internal L1 contract implementation is not a guarenetted spec

Copy link
Contributor

@benaadams benaadams Mar 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. if the contract accepted 1ETH deposits 0xDE0B6B3A7640000 and didn't force it to a Uint256 size in the log output (as L1 contract does) or maybe output in wei rather than gwei; but output packed it would currently be valid and parsable.

However the log would also fail the len(log.data) === 576 check, which first appears in this PR and nothing else prior; and the offset for SIGNATURE_OFFSET and INDEX_OFFSET would be incorrect.

Copy link
Contributor

@benaadams benaadams Mar 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that on mainnet the deposit event length should be 576 bytes

What the mainnet contract does is no guarantee of what ABI compliant contracts that follow the spec do; or Sepolia wouldn't have broken.

What is in all prior specs is that these are all variable length fields; that the size is 576 is an implementation detail of the mainnet contract, however it is not guaranteed in any spec; which in fact specify they are variable length fields.

Mainnet output in gwei for example is a particularly of mainnet asset and the beacon chain requirements, but only L1 uses mainnet CLs/beacon chain.

EIPS/eip-6110.md Outdated
Parses deposit data from DepositContract.DepositEvent data.
"""

# TODO: verify byte length of deposit_event_data? Should be 576. Should this be part of the spec?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it has to be. The simplest is to require the length of the log data to be exactly this (otherwise, no request is noted).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops forgot to remove this TODO. It is currently at line 115 is_deposit_event = (len(log.topics) > 0 and log.topics[0] == DEPOSIT_EVENT_SIGNATURE_HASH and len(log.data) === 576). I'll remove the TODO :)

@jochem-brouwer
Copy link
Member Author

At the testing call tomorrow I'd like to discuss this PR to do a "temperature check" 😄 👍

@jochem-brouwer
Copy link
Member Author

jochem-brouwer commented Mar 18, 2025

Hi all, I have updated this EIP to now also "decode" the size from the raw event data. Note that on Mainnet/Sepolia/Holesky, the raw byte data length is fixed and the sizes are also static. However, since if you want to implement the EIP, you actually need to know how to decode the data. Here is sample event raw data from the EVM:

0x00000000000000000000000000000000000000000000000000000000000000a000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000140000000000000000000000000000000000000000000000000000000000000018000000000000000000000000000000000000000000000000000000000000002000000000000000000000000000000000000000000000000000000000000000030b7c7560d6cd9553118bec32c68f5741627460d9608cd94608eb6fd2b3140a153f682819622a44b59521572b97e42ddee0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000200100000000000000000000009f247b7e252c87f8ee117f641dcc86c5187325e800000000000000000000000000000000000000000000000000000000000000080040597307000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000006085da531b6be0f587b8b4de839386868640407a752687eda8a730a36f3512000d48736abf308228afe9d6660582adfb7b14c00819f49c1cbc68ea325676e94500bdad2c8bb1b435520676ce6079d544b27378d8ba221fde56fa80190f2d321eaf0000000000000000000000000000000000000000000000000000000000000008f632160000000000000000000000000000000000000000000000000000000000

https://holesky.etherscan.io/tx/0x0a47954e95551d665b60db4b5ff0f9ae08c7f3150dda0d4d83ed43454bdd8660

To decode this you will need some context, namely the solidity event having the signature DepositEvent(bytes,bytes,bytes,bytes,bytes). Note, if you change a variable type here, you change DEPOSIT_EVENT_SIGNATURE_HASH as well. So in order to decode this data (in sense of parse_deposit_data) you either need extra context or it needs to be specified in this EIP (and I think EIPs should be self-consistent, so lets specify it here)

Please carefully review 3c96c46

I am also open to some style tips, think it can be made even more clear 😄 👍

Thanks all!

@jochem-brouwer
Copy link
Member Author

Nvm I should update to read the offsets from the data also. Will fix

@jochem-brouwer jochem-brouwer force-pushed the eip-6110-deposit-decoding branch from c35d738 to 3c96c46 Compare March 18, 2025 23:19
@jochem-brouwer
Copy link
Member Author

Have amended the commit 3c96c46, should now read the offsets from the raw data (and not hardcoded constants)

EIPS/eip-6110.md Outdated
# These offsets points are data pointers which first encode the
# size of the data (which is always 32 bytes)
# The data itself (of this size) is then appended after the size 32 bytes
PUBKEY_OFFSET = deposit_event_data[0:32]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
PUBKEY_OFFSET = deposit_event_data[0:32]
pubkey_offset = int.from_bytes(deposit_event_data[0:32], byteorder=‘big')

Since this is not constant anymore, it shouldn’t be capital case. Plus, i think we should parse an offset as int (same for len)

@mkalinin
Copy link
Contributor

If we switch this PR to dynamic lengths then I think we should remove conversion to fixed types in event_data_to_deposit_request and simply return concatenation of parsed event data

EIPS/eip-6110.md Outdated
WITHDRAWAL_CREDENTIALS_OFFSET = deposit_event_data[32:64]
AMOUNT_OFFSET = deposit_event_data[64:96]
SIGNATURE_OFFSET = deposit_event_data[96:128]
INDEX_OFFSET = deposit_event_data[128:160]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if any of these offsets or derived values is outside of the actual data length?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point, I will add in that if this is the case (the LOG is thus invalid) then it is thus also not a valid request and should be skipped

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does have the complication that it's "system tx" level so will fail the block (as per Sepolia)

Which flows into the other discussion about what happens when a system tx fails

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case it meant the contract produced an invalid log data thus invalidating a block would be reasonable.

@benaadams
Copy link
Contributor

If we switch this PR to dynamic lengths then I think we should remove conversion to fixed types in event_data_to_deposit_request and simply return concatenation of parsed event data

That could mean unknown offsets in the deposit requests due to variable sizes?

The conversion to fixed lengths is still important

@mkalinin
Copy link
Contributor

mkalinin commented Mar 20, 2025

The conversion to fixed lengths is still important

If we have fixed len conversion built into EL clients then there is no point for this EIP to support smart contracts with lengths different to the mainnet. In this case we should have a check that the log.data has the expected layout and invalidate a block if this check fails

@jochem-brouwer
Copy link
Member Author

Right, I gave this some more thought, if we start supporting contracts with different lengths or layout then this means that EEST will have to come up with tests to test this behavior (does not matter if we invalidate the block or report requests hashes). This also means that the pre-state for the EEST test will be a custom deposit contract. While this is the case for Sepolia vs Mainnet (different bytecode), the event layout and length of the deposit event does not change.

I think that ELs do not want to be "forced" to implement logic which is not possible on mainnet because it is now specced in the EIP (if we wanted to do this, I think we should have done this earlier).

So, also for the tests where we invalidate the block if the LOG/event emitted is malformed, this will also mean in order to test this, we need new deposit contract bytecode. This is thus not equal than the mainnet bytecode. We can (and should) check but I am rather sure the audit will also cover the event emitted and that this is well-defined and not malformed.

So I think the "manual ABI decoder" we have now is fine. I am also still fine with hardcoding the offsets and sizes.

@jochem-brouwer
Copy link
Member Author

I'll update this PR later today!

@jochem-brouwer
Copy link
Member Author

It's updated, please re-review :)

@jochem-brouwer jochem-brouwer marked this pull request as ready for review March 24, 2025 16:11
@jochem-brouwer jochem-brouwer requested a review from eth-bot as a code owner March 24, 2025 16:11
@jochem-brouwer jochem-brouwer marked this pull request as draft March 24, 2025 16:12
@jochem-brouwer
Copy link
Member Author

Changed to review but quickly reverted it back to draft since I'm not sure if eth-bot will auto merge (since there is one approval).

This is ready for re-review, the spec now does not specify how to decode the actual data, but it does explicitly state how to validate the layout of the log. I think this is the way to go, now any "exotic" bytecodes or other bytecodes will now produce invalid blocks if they: emit an event at the deposit address with the corresponding deposit topic, and the layout itself does not match (this is offsets and the lengths of the data) it will mark the block as invalid (so it is immediately clear that the EIP-6110 spec does not support the bytecode config on that chain (in case of an emitted deposit)

@jochem-brouwer jochem-brouwer marked this pull request as ready for review March 25, 2025 12:09
Co-authored-by: Mikhail Kalinin <[email protected]>
@jochem-brouwer jochem-brouwer force-pushed the eip-6110-deposit-decoding branch from 79d4c2c to 6b0eb53 Compare March 25, 2025 14:20
Copy link
Contributor

@benaadams benaadams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me

@eth-bot eth-bot enabled auto-merge (squash) March 27, 2025 04:18
Copy link
Collaborator

@eth-bot eth-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All Reviewers Have Approved; Performing Automatic Merge...

@eth-bot eth-bot merged commit 619221e into ethereum:master Mar 27, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a-review Waiting on author to review c-update Modifies an existing proposal t-core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants