Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speeding up copying of snapshots #362

Open
kishorenc opened this issue Jun 3, 2022 · 8 comments · May be fixed by #482
Open

Speeding up copying of snapshots #362

kishorenc opened this issue Jun 3, 2022 · 8 comments · May be fixed by #482

Comments

@kishorenc
Copy link
Contributor

I've a cluster where the nodes are far apart geographically, so there is high network latency between the nodes. In such a case, I find the snapshot install from the leader to follower to be pretty slow for large datasets.

When I increase the FLAGS_raft_max_byte_count_per_rpc value, the trasfer became much faster.

  1. Are there any other flags that I can tweak to increase the speed of snapshot transfer?
  2. If a snapshot consists of multiple files, are the files transferred sequentially or parallely? Is there a way to make the file transfer parallel to max out available network bandwidth between nodes?
@PFZheng
Copy link
Collaborator

PFZheng commented Jun 6, 2022

I think there is no other way to speed up snapshot transfer, the files in snapshot are transferred sequentially.

@kishorenc
Copy link
Contributor Author

Thanks for the clarification. Would you accept a patch that possibly parallelized this operation?

@PFZheng
Copy link
Collaborator

PFZheng commented Jun 6, 2022

Thanks for the clarification. Would you accept a patch that possibly parallelized this operation?

Of course!

@kishorenc
Copy link
Contributor Author

@PFZheng

Here's a proposed approach that I want to run by you before making changes:

  • Make LocalSnapshotCopier::copy_file take a vector of filenames:
    void LocalSnapshotCopier::copy_file(const std::string& filename) {
  • Make this file copying code block run parallel on multiple threads (rest of the code is fast so can just remain sequential):
    scoped_refptr<RemoteFileCopier::Session> session
    = _copier.start_to_copy_to_file(filename, file_path, NULL);
    if (session == NULL) {
    LOG(WARNING) << "Fail to copy " << filename
    << " path: " << _writer->get_path();
    set_error(-1, "Fail to copy %s", filename.c_str());
    return;
    }
    _cur_session = session.get();
  • Wrap _copier and _cur_session into a struct and use a vector of structs so that each copy thread can have a separate copy state.

Please let me know if this sounds like a good approach? Also, is it okay to use std::thread -- if not, please point me to some examples in the code where threads are managed via bthread to use as reference.

@kishorenc
Copy link
Contributor Author

@PFZheng

Took a stab at this but ran into concurrency issues with FileServiceImpl::get_file which seems to have trouble allowing fetching files in parallel. Specifically, we got an error here when we tried to download multiple snapshot files in-parallel:

https://github.com/baidu/braft/blob/master/src/braft/file_service.cpp#L70

Here's the rough diff of the changes we've attempted: master...krunal1313:braft:master

Any guidance here is appreciated.

cc @chenzhangyi

@kishorenc
Copy link
Contributor Author

@PFZheng @chenzhangyi

I'm sorry to follow-up: I will really appreciate if you can provide any pointers here.

@chenzhangyi
Copy link
Collaborator

@kishorenc Which kind of error did you get?

@kishorenc
Copy link
Contributor Author

kishorenc commented Feb 13, 2023

This is the error we get with these changes.

W0123 12:03:47.506687 80182 external/com_github_brpc_braft/src/braft/snapshot.cpp:786] 
Fail to copy, error_code 22 error_msg [E22][10.13.87.194:8107][E22]Fail to read from path=/var/lib/app/state/snapshot/snapshot_00000000000000000369 filename=db_snapshot/000444.sst : 
Invalid argument writer path /var/lib/app/state/snapshot/temp

It seems like, on the remote end, multiple files cannot be accessed at the same time.

ambroff added a commit to ambroff/braft that referenced this issue Feb 6, 2025
Currently snapshot transfer happens by sending GetFileRequest for
every file known to be in the remote snapshot. This happens
sequentially for each file.

The only real configurations which allow tuning the throughput of this
transfer are the throttle which can be set when initializing the
braft::Node, or the runtime configuration raft_max_byte_count_per_rpc
which determines how many chunks a large file will be broken into
during the transfer. The default is 128KiB, so a 1MiB file will be
transfered in about 8 GetFileRequests.

This works great for snapshots which have a handful of large
files. But if a snapshot has hundreds or thousands of small files then
transfer of these snapshots can be pretty slow.

I locally create a snapshot with 100k files on my development machine,
for example, it can take up to 30 minutes to transfer all of those
files in that snapshot. Even though the latency per transfer is low,
there is a full round trip plus a flush of __raft_meta on the
receiving end for each file.

This patch adds concurrency to these transfers. When a remote snapshot
is transferred locally, up to raft_max_get_file_request_concurrency
GetFileRequests will be sent concurrently. This defaults to 64.

With this patch, the 100k file snapshot consistently transfers in
under 10 seconds on my development machine.

This should resolve baidu#362.
ambroff added a commit to ambroff/braft that referenced this issue Feb 6, 2025
Currently snapshot transfer happens by sending GetFileRequest for
every file known to be in the remote snapshot. This happens
sequentially for each file.

The only real configurations which allow tuning the throughput of this
transfer are the throttle which can be set when initializing the
braft::Node, or the runtime configuration raft_max_byte_count_per_rpc
which determines how many chunks a large file will be broken into
during the transfer. The default is 128KiB, so a 1MiB file will be
transfered in about 8 GetFileRequests.

This works great for snapshots which have a handful of large
files. But if a snapshot has hundreds or thousands of small files then
transfer of these snapshots can be pretty slow.

I locally create a snapshot with 100k files on my development machine,
for example, it can take up to 30 minutes to transfer all of those
files in that snapshot. Even though the latency per transfer is low,
there is a full round trip plus a flush of __raft_meta on the
receiving end for each file.

This patch adds concurrency to these transfers. When a remote snapshot
is transferred locally, up to raft_max_get_file_request_concurrency
GetFileRequests will be sent concurrently. This defaults to 64.

With this patch, the 100k file snapshot consistently transfers in
under 10 seconds on my development machine.

This should resolve baidu#362.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants