snapshot: Transfer files concurrently to speed up snapshot transfer #482
base: master
Conversation
Currently snapshot transfer happens by sending a GetFileRequest for every file known to be in the remote snapshot. This happens sequentially, one file at a time.

The only real configuration knobs for tuning the throughput of this transfer are the throttle, which can be set when initializing the braft::Node, and the runtime configuration raft_max_byte_count_per_rpc, which determines how many chunks a large file is broken into during the transfer. The default is 128KiB, so a 1MiB file is transferred in about 8 GetFileRequests.

This works great for snapshots with a handful of large files. But if a snapshot has hundreds or thousands of small files, then transferring it can be quite slow. For example, I created a snapshot with 100k files on my development machine, and it can take up to 30 minutes to transfer all of the files in that snapshot. Even though the latency per transfer is low, each file costs a full round trip plus a flush of __raft_meta on the receiving end.

This patch adds concurrency to these transfers. When a remote snapshot is transferred locally, up to raft_max_get_file_request_concurrency GetFileRequests are sent concurrently. This defaults to 64. With this patch, the 100k-file snapshot consistently transfers in under 10 seconds on my development machine.

This should resolve baidu#362.
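Assuming the new knob is registered as a standard gflags runtime flag like braft's existing raft_max_byte_count_per_rpc (the patch defines the flag; the exact registration mechanism is an assumption here), it could be tuned without a recompile, either via `--raft_max_get_file_request_concurrency=128` on the command line or programmatically:

```cpp
#include <gflags/gflags.h>

// Assumption: raft_max_get_file_request_concurrency is a gflags runtime
// flag, like braft's other raft_* options. This raises the per-snapshot
// GetFileRequest concurrency from its default of 64. The flag must be
// linked into the binary (via braft) for SetCommandLineOption to succeed.
int main(int argc, char* argv[]) {
    gflags::ParseCommandLineFlags(&argc, &argv, true);
    gflags::SetCommandLineOption("raft_max_get_file_request_concurrency", "128");
    // ... initialize and run the braft::Node as usual ...
    return 0;
}
```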
```cpp
auto remaining = files_span.size() - i;
auto current_chunk_size = std::min(chunk_size, remaining);
auto chunk = files_span.subspan(i, current_chunk_size);
copy_files(chunk);
```
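For context, here is a hedged sketch of what the chunked copy_files above might do; the function signature, the destination-path handling, and the use of start_to_copy_to_file() are my assumptions for illustration, not the exact patch:

```cpp
#include <span>
#include <string>
#include <vector>

#include <braft/remote_file_copier.h>  // header path is an assumption

// Illustrative sketch (assumed shape, not the exact patch): start one
// transfer session per file in the chunk so their GetFileRequests are in
// flight concurrently, then join the whole chunk before moving on.
void copy_files(braft::RemoteFileCopier& copier,
                std::span<const std::string> chunk,
                const std::string& dest_dir) {
    std::vector<scoped_refptr<braft::RemoteFileCopier::Session>> sessions;
    sessions.reserve(chunk.size());
    for (const std::string& name : chunk) {
        sessions.push_back(copier.start_to_copy_to_file(
            name, dest_dir + '/' + name, /*options=*/NULL));
    }
    for (auto& session : sessions) {
        session->join();  // block until this file's transfer completes
    }
}
```

Joining the whole chunk before refilling is exactly the behavior the review comments below discuss.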
Copy 64 (the default value) files first, and then the next 64 files? That doesn't seem like a good concurrency strategy.
You're right that it's not ideal, but it's still a dramatic improvement, and it would have prevented a recent production incident.
I can look into sending new requests as others complete so we don't transfer in fixed chunks, but I'm worried that will require more fundamental changes. You can't just call Session::join() because you don't know which request will finish first. You could repeatedly poll Session::status(), but that would be a waste of CPU. I think a callback / continuation would have to be added to Session to do this properly.
I guess another alternative is to send all of them concurrently and rely on the snapshot throttle, but that could break anyone who doesn't use the snapshot throttle.
I get you. If RemoteFileCopier::Session had a routine such as on_finish, we could have a better solution.
But if the files are roughly the same size, we can assume the files in one chunk will finish at about the same time.
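If Session grew an on_finish hook as suggested (hypothetical; no such callback exists in braft today), a sliding window could keep exactly N requests in flight, refilling as individual files complete rather than in lockstep chunks. A minimal sketch under that assumption, with a counting semaphore and a stand-in StartCopy type for "start an async transfer and invoke the callback when done":

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical: StartCopy stands in for start_to_copy_to_file() plus the
// proposed Session::on_finish() hook.
using StartCopy = std::function<void(const std::string& file,
                                     std::function<void()> on_finish)>;

void copy_all_sliding_window(const std::vector<std::string>& files,
                             size_t max_in_flight, StartCopy start_copy) {
    std::mutex mu;
    std::condition_variable cv;
    size_t in_flight = 0;

    for (const std::string& file : files) {
        {
            // Block until a slot frees up, so exactly max_in_flight
            // transfers run at any time instead of chunked batches.
            std::unique_lock<std::mutex> lock(mu);
            cv.wait(lock, [&] { return in_flight < max_in_flight; });
            ++in_flight;
        }
        start_copy(file, [&] {
            // Completion callback: release the slot and wake the dispatcher.
            std::lock_guard<std::mutex> lock(mu);
            --in_flight;
            cv.notify_one();
        });
    }
    // Drain: wait for the last transfers to finish.
    std::unique_lock<std::mutex> lock(mu);
    cv.wait(lock, [&] { return in_flight == 0; });
}
```

With max_in_flight set to 64 this keeps the window continuously full, so a few straggler files no longer stall the start of the next batch.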