-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update EIP-6110: Spec parse_deposit_data
#9460
Update EIP-6110: Spec parse_deposit_data
#9460
Conversation
✅ All reviewers have approved. |
parse_deposit_data
parse_deposit_data
parse_deposit_data
parse_deposit_data
EIPS/eip-6110.md
Outdated
Parses deposit data from DepositContract.DepositEvent data. | ||
""" | ||
|
||
# TODO: verify byte length of deposit_event_data? Should be 576. Should this be part of the spec? (Yes?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the failure mode for this check? Maybe we should do it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added this also! 4f912ac 😄 👍
EIPS/eip-6110.md
Outdated
# which represents the dynamic size of the encoded bytes event data | ||
# The size itself is encoded in a 32-byte, so skip over this as well | ||
# to arrive at the first actual data offset to be read (pubkey) | ||
currentOffset = 32 * 5 + 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have PUBKEY_OFFSET, WITHDRAWAL_CREDENTIALS_OFFSET, etc instead of these computations? and then:
pubkey = deposit_event_data[PUBKEY_OFFSET:PUBKEY_OFFSET + 48]
withdrawal_credentials = deposit_event_data[WITHDRAWAL_CREDENTIALS_OFFSET: WITHDRAWAL_CREDENTIALS_OFFSET + 32]
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have added! Please verify if my offsets and sizes are correct 😄 👍
EIPS/eip-6110.md
Outdated
pubkey = deposit_event_data[currentOffset:currentOffset + 48] | ||
# The ABI data is encoded in 32-byte chunks, so the 64 bytes are reserved for the data | ||
# The next 32 bytes are reserved for the size of the next item (withdrawals) so skip over this as well | ||
currentOffset += 48 + 16 + 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using an ABI isn't particularly important; what is important is the data fields are length prefixed byte arrays. The fixed offsets are a quirk of the current contracts.
Properly parsing (using length prefix) means it would still work if a contract on a different chain output the same data but trimmed leading zeros; however using fixed offsets would break (if the argument is not to use ABI because "it might change"; changing would first break the offsets) (edited)
[3:06 PM]
The data is literally saying the lengths of the data elements; seem irresponsible to ignore it and lay a presumed offset format on it? If it was C that's how you'd get buffer over/under runs.
What if the deposit contract was written in viper or some other language that decided to encode packed? Is valid by the event log specification, would fail to be parsed.
It's also both literally encoding ABI and also ignoring it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point regarding the ABI, I should not have added this as comment at all. Have removed :)
The goal this PR wants to solve is that parse_deposit_data
is currently not specced. This PR should spec it. We have two approaches (and we should decide async or via ACD-E):
- Hardcoded offsets/sizes (this PR)
- Use ABI decoding
Note that the current EIP spec has this in it:
#### Deposit request
The structure denoting the new deposit request consists of the following fields:
1. `pubkey: Bytes48`
2. `withdrawal_credentials: Bytes32`
3. `amount: uint64`
4. `signature: Bytes96`
5. `index: uint64`
This is how the EIP-7685 request_data
is being encoded. (This would also raise the question for me: why does the Deposit event not encode this directly in the event signature? But this is a side-question, we can't change this anyways, and neither can we change that on mainnet these bytes
are always at the same offset and same size for all event data of the deposit event)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the OP of this PR also regarding this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this EIP is encoding that anything decoding using the specified ABI should fail if junky data was passed but the offset would "pass" which seems sloppy? Since the data would be wrong for anyone parsing the event upstream (and would the data even be correct in that circumstance?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure if I got this comment right but I think this applies here as well: #9460 (comment)
85f70b7
to
4f912ac
Compare
EIPS/eip-6110.md
Outdated
@@ -97,7 +112,7 @@ def get_deposit_request_data(receipts) | |||
deposit_requests = [] | |||
for receipt in receipts: | |||
for log in receipt.logs: | |||
is_deposit_event = (len(log.topics) > 0 and log.topics[0] == DEPOSIT_EVENT_SIGNATURE_HASH) | |||
is_deposit_event = (len(log.topics) > 0 and log.topics[0] == DEPOSIT_EVENT_SIGNATURE_HASH and len(log.data) === 576) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when its the correct contract, correct signature but log.data
is not exactly 576 bytes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It skips the log. Note that the current spec tells us how to encode the request (no changes to it in this PR):
1. `pubkey: Bytes48`
2. `withdrawal_credentials: Bytes32`
3. `amount: uint64`
4. `signature: Bytes96`
5. `index: uint64`
Note that if these value lengths are ABI encoded you would still end up with the 576 length event, because:
- 5 offset
Bytes32
(160 bytes total) - 5 size
Bytes32
(160 bytes total) pubkey
: 48 bytes, but will get rounded up to 64 byteswithdrawal_credentials
: 32 bytesamount
: will get rounded up to 32 bytessignature
: 96 bytesindex
: will get rounded up to 32 bytes
So, in total (5+5) * 32 + 64 + 32 + 32 + 96 + 32 = 576 bytes. If we would take, for instance, the pubkey
and would now encode the actual bytes data as more than 64 bytes, then we would thus get extra 32 bytes extra in the log data, however, this cannot be encoded in the request as Bytes48
anymore (unless we want to ignore any data).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the current spec tells us how to encode the request
The spec does not specify lengths and offsets, it specifies that the input values are variable https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/deposit-contract.md#staking-deposit-contract
The deposit contract has a public deposit function to make deposits. It takes as arguments
bytes
calldata pubkey,bytes
calldata withdrawal_credentials,bytes
calldata signature,bytes32
deposit_data_root
and also that the logs are variable
Every deposit emits a
DepositEvent
log for consumption by the beacon chain. The deposit contract does little validation, pushing most of the validator onboarding logic to the beacon chain.
The L1 smart contract also specify that the types are variable https://etherscan.io/address/0x00000000219ab540356cbb839cbe05303d7705fa#code
Both in the input values
function deposit(
bytes calldata pubkey,
bytes calldata withdrawal_credentials,
bytes calldata signature,
bytes32 deposit_data_root
) override external payable {
and in the log
event DepositEvent(
bytes pubkey,
bytes withdrawal_credentials,
bytes amount,
bytes signature,
bytes index
);
Where is it specified that they are fixed sized buffers of Bytes48
, Bytes32
, uint64
, Bytes96
, uint64
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I should have been more clear, I was not referring to the solidity spec, but to the encoding spec of the actual request (so the request which gets in the block header) of this EIP. See: https://eips.ethereum.org/EIPS/eip-6110#execution-layer and then to the "Deposit request" part:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok but those are post parsing the log data? i.e. parse_deposit_data
and is about ho to convert to fixed sizes (for output)
Before that where is the len(log.data) === 576
guarantee in any prior spec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There technically is none, but, the deposit event hash is now hardcoded in clients to filter out non-DepositEvent
logs (so the Sepolia bug)
So this is given:
event DepositEvent(
bytes pubkey,
bytes withdrawal_credentials,
bytes amount,
bytes signature,
bytes index
);
Note: keccak256("DepositEvent(bytes,bytes,bytes,bytes,bytes)="0x649bbc62d0e31342afea4e5cd82d4049e7e1ee912fc0889aa790803be39038c5"
By ABI encoding as mentioned here: #9460 (comment) if we keep the way how requests are encoded, then if more data would get added to one of these parameters, it would not fit in the requests data (and we thus need to edit how the request itself gets encoded in the header)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side-note, this extra check is taken from Geth's code: https://github.com/ethereum/go-ethereum/blob/cd78b65cdaa9e00866962e38685c4f93ac9e8444/core/types/deposit.go#L28
Note that on mainnet the deposit event length should be 576 bytes always (implied by the solidity implementation, did not check bytecode)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keccak256("DepositEvent(bytes,bytes,bytes,bytes,bytes)="0x649bbc62d0e31342afea4e5cd82d4049e7e1ee912fc0889aa790803be39038c5"
Yes and bytes
are variable length values; not fixed length. So the ABI does not have fixed position offsets and lengths.
The L1 code does have require
statements specifying the length of the params; however we know from 3 other chains (Sepolia, Chaido and Gnosis) that the internal L1 contract implementation is not a guarenetted spec
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g. if the contract accepted 1ETH deposits 0xDE0B6B3A7640000
and didn't force it to a Uint256
size in the log output (as L1 contract does) or maybe output in wei
rather than gwei
; but output packed it would currently be valid and parsable.
However the log would also fail the len(log.data) === 576
check, which first appears in this PR and nothing else prior; and the offset for SIGNATURE_OFFSET
and INDEX_OFFSET
would be incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that on mainnet the deposit event length should be 576 bytes
What the mainnet contract does is no guarantee of what ABI compliant contracts that follow the spec do; or Sepolia wouldn't have broken.
What is in all prior specs is that these are all variable length fields; that the size is 576 is an implementation detail of the mainnet contract, however it is not guaranteed in any spec; which in fact specify they are variable length fields.
Mainnet output in gwei
for example is a particularly of mainnet asset and the beacon chain requirements, but only L1 uses mainnet CLs/beacon chain.
EIPS/eip-6110.md
Outdated
Parses deposit data from DepositContract.DepositEvent data. | ||
""" | ||
|
||
# TODO: verify byte length of deposit_event_data? Should be 576. Should this be part of the spec? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it has to be. The simplest is to require the length of the log data to be exactly this (otherwise, no request is noted).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops forgot to remove this TODO. It is currently at line 115 is_deposit_event = (len(log.topics) > 0 and log.topics[0] == DEPOSIT_EVENT_SIGNATURE_HASH and len(log.data) === 576)
. I'll remove the TODO :)
At the testing call tomorrow I'd like to discuss this PR to do a "temperature check" 😄 👍 |
Hi all, I have updated this EIP to now also "decode" the size from the raw event data. Note that on Mainnet/Sepolia/Holesky, the raw byte data length is fixed and the sizes are also static. However, since if you want to implement the EIP, you actually need to know how to decode the data. Here is sample event raw data from the EVM:
https://holesky.etherscan.io/tx/0x0a47954e95551d665b60db4b5ff0f9ae08c7f3150dda0d4d83ed43454bdd8660 To decode this you will need some context, namely the solidity event having the signature Please carefully review 3c96c46 I am also open to some style tips, think it can be made even more clear 😄 👍 Thanks all! |
Nvm I should update to read the offsets from the data also. Will fix |
c35d738
to
3c96c46
Compare
Have amended the commit 3c96c46, should now read the offsets from the raw data (and not hardcoded constants) |
EIPS/eip-6110.md
Outdated
# These offsets points are data pointers which first encode the | ||
# size of the data (which is always 32 bytes) | ||
# The data itself (of this size) is then appended after the size 32 bytes | ||
PUBKEY_OFFSET = deposit_event_data[0:32] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PUBKEY_OFFSET = deposit_event_data[0:32] | |
pubkey_offset = int.from_bytes(deposit_event_data[0:32], byteorder=‘big') |
Since this is not constant anymore, it shouldn’t be capital case. Plus, i think we should parse an offset as int
(same for len)
If we switch this PR to dynamic lengths then I think we should remove conversion to fixed types in |
EIPS/eip-6110.md
Outdated
WITHDRAWAL_CREDENTIALS_OFFSET = deposit_event_data[32:64] | ||
AMOUNT_OFFSET = deposit_event_data[64:96] | ||
SIGNATURE_OFFSET = deposit_event_data[96:128] | ||
INDEX_OFFSET = deposit_event_data[128:160] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if any of these offsets or derived values is outside of the actual data length?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point, I will add in that if this is the case (the LOG is thus invalid) then it is thus also not a valid request and should be skipped
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does have the complication that it's "system tx" level so will fail the block (as per Sepolia)
Which flows into the other discussion about what happens when a system tx fails
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case it meant the contract produced an invalid log data thus invalidating a block would be reasonable.
That could mean unknown offsets in the deposit requests due to variable sizes? The conversion to fixed lengths is still important |
If we have fixed len conversion built into EL clients then there is no point for this EIP to support smart contracts with lengths different to the mainnet. In this case we should have a check that the |
Right, I gave this some more thought, if we start supporting contracts with different lengths or layout then this means that EEST will have to come up with tests to test this behavior (does not matter if we invalidate the block or report requests hashes). This also means that the pre-state for the EEST test will be a custom deposit contract. While this is the case for Sepolia vs Mainnet (different bytecode), the event layout and length of the deposit event does not change. I think that ELs do not want to be "forced" to implement logic which is not possible on mainnet because it is now specced in the EIP (if we wanted to do this, I think we should have done this earlier). So, also for the tests where we invalidate the block if the LOG/event emitted is malformed, this will also mean in order to test this, we need new deposit contract bytecode. This is thus not equal than the mainnet bytecode. We can (and should) check but I am rather sure the audit will also cover the event emitted and that this is well-defined and not malformed. So I think the "manual ABI decoder" we have now is fine. I am also still fine with hardcoding the offsets and sizes. |
I'll update this PR later today! |
It's updated, please re-review :) |
Changed to review but quickly reverted it back to draft since I'm not sure if eth-bot will auto merge (since there is one approval). This is ready for re-review, the spec now does not specify how to decode the actual data, but it does explicitly state how to validate the layout of the log. I think this is the way to go, now any "exotic" bytecodes or other bytecodes will now produce invalid blocks if they: emit an event at the deposit address with the corresponding deposit topic, and the layout itself does not match (this is offsets and the lengths of the data) it will mark the block as invalid (so it is immediately clear that the EIP-6110 spec does not support the bytecode config on that chain (in case of an emitted deposit) |
Co-authored-by: Mikhail Kalinin <[email protected]>
79d4c2c
to
6b0eb53
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All Reviewers Have Approved; Performing Automatic Merge...
EIP-6110 currently does not specify how to parse the deposit event data. This PR adds the specific layout of the mainnet (and Sepolia) log data to the spec. If the deposit contract logs a DepositEvent but the layout is incorrect, the block will be marked as invalid, because the deposit contract does not adhere the deposit event layout.
Note that this approach does only spec the layout of the log: the actual bytecode of the contract is not specified. Therefore, mainnet, sepolia and holesky all match this spec. (Mainnet/Holesky bytecode is the same: Sepolia is however a "permissioned" ERC20-token contract (instead of accepting native Eth deposits) and thus has different bytecode. This thus allows any bytecode contract to be valid, given that the
DepositEvent
as being LOG-ed having the layout per this PR)