This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

remote-ext: improve state download performance on slow connections #14746

Open
wants to merge 3 commits into
base: master

Conversation

liamaharon
Contributor

@liamaharon liamaharon commented Aug 10, 2023

Introduction

I experienced an apparent stall while downloading state from https://rococo-try-runtime-node.parity-chains.parity.io:443, which was having networking difficulties and responding to my JSON-RPC requests at only 50-200KB/s.

This PR fixes the issue causing the stall and generally improves remote-ext performance when downloading state by greatly reducing the chance of a timeout occurring.

Description

Introduces a new REQUEST_DURATION_TARGET constant and modifies get_storage_data_dynamic_batch_size to:

  1. Increase or decrease the batch size of the next request depending on whether the elapsed time of the last request was less than or greater than the target
  2. Reset the batch size to 1 if the request times out
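
The two rules above can be sketched as a small pure function. This is a minimal illustration, not the code in this PR: the value of `REQUEST_DURATION_TARGET`, the doubling/halving step, the `RequestOutcome` enum, and the `next_batch_size` helper are all assumptions made for the example.

```rust
use std::time::Duration;

/// Assumed value for illustration; the PR introduces a constant of this
/// name, but its actual value is not stated in the description.
const REQUEST_DURATION_TARGET: Duration = Duration::from_secs(15);

/// Hypothetical type describing how the previous batch request ended.
pub enum RequestOutcome {
    /// The request completed after the given elapsed time.
    Completed(Duration),
    /// The request hit the client timeout.
    TimedOut,
}

/// Hypothetical helper: picks the batch size for the next request.
/// Doubling and halving are assumed step sizes, not taken from the PR.
pub fn next_batch_size(current: usize, outcome: RequestOutcome, max: usize) -> usize {
    match outcome {
        // Rule 2: a timeout resets the batch size to 1.
        RequestOutcome::TimedOut => 1,
        // Rule 1: faster than the target, so grow (capped at `max`).
        RequestOutcome::Completed(elapsed) if elapsed < REQUEST_DURATION_TARGET => {
            current.saturating_mul(2).min(max).max(1)
        }
        // Rule 1: slower than the target, so shrink (never below 1).
        RequestOutcome::Completed(elapsed) if elapsed > REQUEST_DURATION_TARGET => {
            (current / 2).max(1)
        }
        // Exactly on target: keep the current size.
        RequestOutcome::Completed(_) => current,
    }
}
```

With these assumed values, a batch of 8 that completes in 1 second grows to 16, one that takes 60 seconds shrinks to 4, and one that times out drops to 1.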

This fixes an issue on slow connections that can otherwise cause multiple timeouts and a stalled download:

  1. The batch size increases rapidly as remote-ext downloads keys with small associated storage values
  2. remote-ext then reaches a long run of consecutive keys that all have extremely large associated storage values (Rococo has a series of keys with values of 1-5MB each)
  3. The huge storage values download for 5 minutes until the request times out
  4. The partially downloaded keys are thrown out and remote-ext tries again with a smaller batch size, but the batch size is still far too large and takes another 5 minutes to be reduced again
  5. The download is essentially stalled for many hours while steps 3 and 4 repeat
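
To make the cost of that cycle concrete, here is a rough back-of-the-envelope sketch. Everything in it is illustrative: the starting batch size of 4096, the 2MB value size, the 100KB/s bandwidth, and the "divide by a fixed factor on timeout" policy are assumptions chosen to fall within the ranges described above, and `request_secs` and `timeouts_until_fit` are hypothetical helpers, not remote-ext functions.

```rust
/// Seconds needed to download `batch` values of `value_bytes` each
/// at `bytes_per_sec` (integer division, so rounded down).
pub fn request_secs(batch: u64, value_bytes: u64, bytes_per_sec: u64) -> u64 {
    batch * value_bytes / bytes_per_sec
}

/// Counts how many requests time out before a batch fits inside
/// `timeout_secs`, assuming each timeout divides the batch size by
/// `shrink_divisor` (a hypothetical reduction policy).
pub fn timeouts_until_fit(
    mut batch: u64,
    value_bytes: u64,
    bytes_per_sec: u64,
    timeout_secs: u64,
    shrink_divisor: u64,
) -> u32 {
    // Guard against a divisor that would never shrink the batch.
    let divisor = shrink_divisor.max(2);
    let mut timeouts = 0;
    while batch > 1 && request_secs(batch, value_bytes, bytes_per_sec) > timeout_secs {
        batch = (batch / divisor).max(1);
        timeouts += 1;
    }
    timeouts
}
```

Under these assumptions, `timeouts_until_fit(4096, 2_000_000, 100_000, 300, 2)` returns 9: nine consecutive 5-minute timeouts (45 minutes with no progress) before a batch of 8 finally fits. Passing a divisor of 4096, which models dropping straight to 1, returns 1. Slower connections or larger values stretch the first figure further.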

After this PR, the request size will

  1. Not grow as large to begin with, because it is regulated downwards whenever the request duration exceeds the target
  2. Drop immediately to 1 if the request times out. A timeout indicates that the keys next in line to download have extremely large storage values compared to previously downloaded keys, so we need to reset the batch size and work out the new ideal batch size from scratch. Without resetting to 1, we risk the next request timing out again.

@liamaharon liamaharon added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. T0-node This PR/Issue is related to the topic “node”. labels Aug 10, 2023