Optimize block time to head #5413

dapplion · 2023-04-25T06:04:18Z

On unstable, metrics show that the time to process a block, measured from when it is seen on gossip until it is set as head, ranges from 600 to 900 ms. This includes calling the execution client to notify the payload. Running the state transition function for a block takes ~50 ms and computing the head another ~50 ms, so the long processing times are due to resources shared with other work, such as processing attestations.

Block process timeline

| Step | Avg time |
| --- | --- |
| Gossip validation | 30 ms |
| Send notify payload | 10 ms |
| State transition | 45 ms |
| State hashing | 25 ms |
| Verify block sigs | 25 ms |
| notify new payload idle | 300 ms |
| import attestations | 10 ms |
| recompute head | 100 ms |
| persist LC data | 25 ms |
| total | 570 ms |
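To make the relative weight of each step easier to see, here is a small sketch that recomputes the total and each step's share from the averages in the table. The variable names are illustrative only and nothing here comes from the Lodestar codebase.

```typescript
// Average step durations in milliseconds, copied from the timeline table.
const stepsMs: Array<[string, number]> = [
  ["Gossip validation", 30],
  ["Send notify payload", 10],
  ["State transition", 45],
  ["State hashing", 25],
  ["Verify block sigs", 25],
  ["notify new payload idle", 300],
  ["import attestations", 10],
  ["recompute head", 100],
  ["persist LC data", 25],
];

// Sum of all steps; should match the "total" row of the table.
const totalMs = stepsMs.reduce((sum, [, ms]) => sum + ms, 0);

// Share of the total per step, largest first.
const shares = stepsMs
  .map(([step, ms]) => ({step, ms, pct: (100 * ms) / totalMs}))
  .sort((a, b) => b.pct - a.pct);

for (const {step, ms, pct} of shares) {
  console.log(`${step}: ${ms} ms (${pct.toFixed(1)}%)`);
}
console.log(`total: ${totalMs} ms`);
```

The idle wait for the new payload response is a little over half of the total, and recompute head is the next largest item, which is why both show up in the optimization targets.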

CPU profiles show that attestation processing interlaces with block processing. It's unclear for now whether the 300 ms wait for the new payload response is caused by that or not. See two examples of the loop being idle and busy while waiting for the notify new payload response.

Research / gather more data

What's the actual notify new payload time from the point of view of the execution client?
- On unstable mainnet, the average time is ~100 ms

rpc_duration_engine_newPayloadV2_success {quantile="0.5"} 104116.5
rpc_duration_engine_newPayloadV2_success {quantile="0.75"} 132396.75
rpc_duration_engine_newPayloadV2_success {quantile="0.95"} 225751.89999999988
rpc_duration_engine_newPayloadV2_success {quantile="0.99"} 364260.3900000002
rpc_duration_engine_newPayloadV2_success {quantile="0.999"} 1.0846435890000006e+06
rpc_duration_engine_newPayloadV2_success {quantile="0.9999"} 1.090077e+06
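For comparison with the 300 ms idle figure, the quantiles can be converted to milliseconds. This is a sketch under one assumption: the raw values are in microseconds, which is inferred from the "~100 ms" average quoted above matching the 0.5 quantile of 104116.5, and is not stated by the metric itself. The helper name `parseQuantiles` is hypothetical.

```typescript
// Quantile samples copied verbatim from the metrics dump above.
const rawMetrics = `
rpc_duration_engine_newPayloadV2_success {quantile="0.5"} 104116.5
rpc_duration_engine_newPayloadV2_success {quantile="0.75"} 132396.75
rpc_duration_engine_newPayloadV2_success {quantile="0.95"} 225751.89999999988
rpc_duration_engine_newPayloadV2_success {quantile="0.99"} 364260.3900000002
rpc_duration_engine_newPayloadV2_success {quantile="0.999"} 1.0846435890000006e+06
rpc_duration_engine_newPayloadV2_success {quantile="0.9999"} 1.090077e+06
`;

// ASSUMPTION: values are microseconds (inferred from the text, not from the metric).
const US_PER_MS = 1000;

// Extract {quantile, ms} pairs from lines of the form: name {quantile="q"} value
function parseQuantiles(text: string): Array<{quantile: number; ms: number}> {
  const out: Array<{quantile: number; ms: number}> = [];
  const re = /\{quantile="([^"]+)"\}\s+(\S+)/;
  for (const line of text.split("\n")) {
    const m = re.exec(line);
    if (m === null) continue; // skip blank or unrelated lines
    out.push({quantile: Number(m[1]), ms: Number(m[2]) / US_PER_MS});
  }
  return out;
}

const quantiles = parseQuantiles(rawMetrics);
for (const {quantile, ms} of quantiles) {
  console.log(`p${quantile * 100}: ${ms.toFixed(1)} ms`);
}
```

Under that unit assumption, everything up to the 0.95 quantile is below 300 ms as seen by the execution client, so a 300 ms average wait on the beacon node side would leave a gap to explain.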

If significantly less than 300 ms, add a snooper to watch traffic and understand where the delay is
At what point into the slot is the new payload HTTP request actually dispatched from the OS?
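One way to approach the dispatch-time question is to stamp the request with its offset into the current slot right before handing it to the HTTP client, then compare that stamp with what a snooper sees on the wire. The sketch below only does the slot arithmetic; `slotOffset` is a hypothetical helper, and an application-level timestamp cannot by itself show when the OS actually sends the bytes. The constants are the mainnet genesis time and slot duration.

```typescript
// Mainnet consensus constants.
const MAINNET_GENESIS_TIME_SEC = 1606824023;
const SECONDS_PER_SLOT = 12;

// Hypothetical helper: given a wall-clock time in ms, return the slot number
// and how many ms into that slot the timestamp falls.
function slotOffset(
  nowMs: number,
  genesisTimeSec: number = MAINNET_GENESIS_TIME_SEC,
  secondsPerSlot: number = SECONDS_PER_SLOT
): {slot: number; msIntoSlot: number} {
  const elapsedMs = nowMs - genesisTimeSec * 1000;
  const slotMs = secondsPerSlot * 1000;
  const slot = Math.floor(elapsedMs / slotMs);
  return {slot, msIntoSlot: elapsedMs - slot * slotMs};
}

// Illustrative call site: stamp just before the request is handed off.
const dispatchStamp = slotOffset(Date.now());
console.log(`newPayload handed off at slot ${dispatchStamp.slot}, +${dispatchStamp.msIntoSlot} ms`);
```

Logging the same offset when the block is first seen on gossip and when the response arrives would place all three events on one per-slot timeline.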

Optimization targets

Algorithmic optimization of recompute head
~~Stack recompute head with new payload waiting~~ - investigated, found that it doesn't improve timing
~~Complete block import before the new payload response, setting it as optimistic head, and set it as VALID on response later~~ - likely to cause missed attestations
Block network processor queues when importing recent blocks, to allocate all cycles to the block processor without interlaced attestations
Optimize persist LC data, or schedule it for later to complete the import faster
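The idea of blocking the network processor queues during a block import can be sketched as a gate on the job queue. This is a minimal illustration under stated assumptions, not the Lodestar network processor: `GatedQueue` and `importBlock` are hypothetical names, and the example is synchronous for brevity, whereas a real import is asynchronous and the gate matters because queued attestation jobs could otherwise run between its awaits.

```typescript
type Job = () => void;

// Hypothetical job queue that can be gated while a block import is in flight.
class GatedQueue {
  private readonly jobs: Job[] = [];
  private blocked = false;

  push(job: Job): void {
    this.jobs.push(job);
  }

  block(): void {
    this.blocked = true;
  }

  unblock(): void {
    this.blocked = false;
  }

  get length(): number {
    return this.jobs.length;
  }

  // Run up to maxJobs queued jobs in FIFO order; runs nothing while blocked.
  drain(maxJobs: number): number {
    let ran = 0;
    while (!this.blocked && ran < maxJobs && this.jobs.length > 0) {
      const job = this.jobs.shift() as Job;
      job();
      ran++;
    }
    return ran;
  }
}

// Hold the gate for the duration of the import, releasing it even on failure.
function importBlock(queue: GatedQueue, doImport: () => void): void {
  queue.block();
  try {
    doImport();
  } finally {
    queue.unblock();
  }
}

// Usage: attestation jobs queued before the import only run after it completes.
const attestationQueue = new GatedQueue();
const order: string[] = [];
attestationQueue.push(() => order.push("attestation"));
importBlock(attestationQueue, () => {
  attestationQueue.drain(10); // simulates a scheduler tick mid-import; runs nothing
  order.push("block");
});
attestationQueue.drain(10);
console.log(order.join(" -> "));
```

The trade-off is that attestations queue up during the import, so the gate should be held only for blocks of the current slot and released as soon as the head is set.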

CC @tuyennhv

This was referenced Apr 26, 2023

Do not serialize gossip block when persisting to db #5422

Closed

Investigate high execution payload verification time #5429

Closed

feat: block network processor when processing current slot block #5458

Merged

philknows assigned twoeths Nov 7, 2023

philknows added the prio-medium Resolve this some time soon (tm). label Nov 7, 2023

wemeetagain unassigned twoeths Oct 10, 2024

philknows added the scope-performance Performance issue and ideas to improve performance. label Oct 15, 2024
