PR3 coro_rpc_communicator python bindings #1106
base: main
- Add cororpc_communicator and cororpc_interface headers and implementation
- Implement async RPC communication layer using yalantinglibs coro_rpc
- Add CMake configuration for coro_rpc_connector module
- Add bandwidth test script for performance validation
- Support both TCP and RDMA transports via MC_RPC_PROTOCOL env var

This introduces a new transport layer based on coro_rpc for high-performance asynchronous communication in the transfer engine. The implementation uses zero-copy techniques via attachments and provides both synchronous and asynchronous APIs.
Transfer Engine Python bindings:
- Add coro_rpc interface integration to transfer_engine_py.cpp
- Support transfer operations with notify mechanism
- Add getNotifies() method for retrieving transfer notifications
- Improve batch transfer sync APIs with notify support

Store Python bindings:
- Add batch_put_from_multi_buffers() for efficient batch operations
- Add batch_get_into_multi_buffers() for batch retrieval
- Enhance setup() with additional configuration options
- Improve error handling and parameter validation

Testing and utilities:
- Add comprehensive test_tensor_api.py test script
- Update mooncake_config.py with new configuration options
- Update CMakeLists.txt for Python integration modules

Note: This PR depends on:
- PR1 (coro_rpc_connector) for transfer_engine_py.cpp
- PR2 (Store enhancements) for store_py.cpp
Summary of Changes
Hello @JasonZhang517, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the interoperability of the Mooncake project by integrating Python bindings for the …
Code Review
This pull request introduces Python bindings for the coro_rpc_communicator. The changes primarily involve updating CMakeLists.txt to incorporate the new dependencies and adding the necessary pybind11 wrapper code in transfer_engine_py.cpp. The Python bindings themselves are well-implemented. My review focuses on a couple of areas for improvement in the CMake script to remove redundancy and prevent potential build issues.
mooncake-integration/CMakeLists.txt
Outdated
    if (TARGET coro_rpc_connector)
        target_link_libraries(engine PUBLIC coro_rpc_connector)
    endif()
mooncake-integration/CMakeLists.txt
Outdated
    if (TARGET transfer_engine)
        target_link_libraries(engine PUBLIC transfer_engine)
    endif()
…unicator config 3. removed comment lines
Pull request overview
This PR attempts to add Python bindings for the coro_rpc_communicator component to the Mooncake transfer engine. However, the implementation has several critical issues that prevent it from compiling. The changes add new binding code in transfer_engine_py.cpp and update the CMakeLists.txt build configuration, but reference files and functions that don't exist in the codebase.
Key issues:
- References a non-existent header file (cororpc_interface.h) and undefined classes/functions
- Contains duplicate CMake target definitions that will cause build failures
- Uses an undefined CMake variable (ENGINE_MODULE_SOURCES)
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| mooncake-integration/transfer_engine/transfer_engine_py.cpp | Adds Python bindings for CoroRPCInterface, ReceivedData, and ReceivedTensor classes, plus factory functions. However, the included header and referenced symbols don't exist. |
| mooncake-integration/CMakeLists.txt | Updates build configuration with duplicate pybind11_add_module definitions, redundant Python3 package finding, and conditional linking to the coro_rpc_connector target. |
    m.def("create_rpc_client", &createRPCClient, "local_rank"_a,
          "world_size"_a);
    m.def("create_rpc_server", &createRPCServer, "local_rank"_a,
          "world_size"_a);
Copilot AI, Dec 17, 2025
The factory function names createRPCClient and createRPCServer use inconsistent casing compared to the existing pattern in the codebase. The existing rpc_interface.h uses createRpcClient and createRpcServer (with 'Rpc' in camelCase, not 'RPC' in all caps). For consistency, these should be named createCoroRpcClient and createCoroRpcServer to match the existing naming convention.
    m.def("create_rpc_client", &createRPCClient, "local_rank"_a,
          "world_size"_a);
    m.def("create_rpc_server", &createRPCServer, "local_rank"_a,
          "world_size"_a);
Copilot AI, Dec 17, 2025
The functions createRPCClient and createRPCServer are referenced but not declared or defined anywhere in the codebase. These factory functions need to be either: 1) declared and defined in the cororpc_interface.h header (similar to how createRpcClient and createRpcServer are defined in rpc_interface.h), or 2) removed from these bindings if they don't exist. Without proper declarations, this code will not compile.
    // Forward declaration for coro_rpc_interface binding function
    void bind_coro_rpc_interface(py::module_ &m);
Copilot AI, Dec 17, 2025
The forward declaration of bind_coro_rpc_interface is immediately followed by its implementation in the same file. This forward declaration is unnecessary since the function is defined before it's used in the PYBIND11_MODULE. Consider removing the forward declaration and keeping only the implementation.
mooncake-integration/CMakeLists.txt
Outdated
        endif()
    endif()

    pybind11_add_module(engine ${ENGINE_MODULE_SOURCES})
Copilot AI, Dec 17, 2025
The variable ENGINE_MODULE_SOURCES is used but never defined in this CMakeLists.txt file. This will cause the pybind11_add_module command to create an empty module or fail. Either define ENGINE_MODULE_SOURCES before this line, or replace it with the actual source files needed for the engine module (likely transfer_engine/transfer_engine_py.cpp).
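One way to address this, as a minimal sketch: define the variable before it is used. The single source path below is only the one suggested in the comment above, so it is an assumption about what the engine module actually needs.

```cmake
# Assumed source list for the engine module; adjust to the real set of files.
set(ENGINE_MODULE_SOURCES
    transfer_engine/transfer_engine_py.cpp
)

# The variable now expands to a non-empty source list.
pybind11_add_module(engine ${ENGINE_MODULE_SOURCES})
```

Passing the file directly to pybind11_add_module would work just as well if no other target shares the list.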
mooncake-integration/CMakeLists.txt
Outdated
        endif()
    endif()

    pybind11_add_module(engine ${ENGINE_MODULE_SOURCES})
Copilot AI, Dec 17, 2025
The pybind11_add_module for 'engine' is defined twice in this file - once at line 35 (inside the WITH_TE conditional block) and again at line 65 (unconditionally). This creates a duplicate target definition which will cause a CMake configuration error. Remove one of these duplicate definitions. Since the one at line 65 appears to be the newer addition for coro_rpc support, you likely need to either merge the two definitions or ensure only one is active based on the build configuration.
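A possible shape for a merged definition is sketched below. It assumes the engine module should only be built when WITH_TE is enabled, that coro_rpc_connector is an optional target, and that transfer_engine/transfer_engine_py.cpp is the module's only source; none of these is confirmed by the PR.

```cmake
if (WITH_TE)
    # Single definition of the 'engine' target (source path is an assumption).
    pybind11_add_module(engine transfer_engine/transfer_engine_py.cpp)

    # Link the coro_rpc connector only when that target exists in this build.
    if (TARGET coro_rpc_connector)
        target_link_libraries(engine PUBLIC coro_rpc_connector)
    endif()
endif()
```

Keeping the optional link inside the same block as the target definition means the engine target is created exactly once, whichever combination of options is active.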
    include_directories("./")
    include_directories("../mooncake-transfer-engine/include")

    find_package(Python3 COMPONENTS Interpreter Development REQUIRED)
Please see line 46! It's duplicated.
    // Include coro_rpc_interface headers
    #include "transport/coro_rpc_connector/cororpc_interface.h"

    using namespace pybind11::literals;
py::literals are already accessible.
Please address the issues above.

Please fix compilation errors for a quick merge. THX
feca777 to b963438
code format....
    namespace py = pybind11;

    namespace mooncake {
    namespace py = pybind11;
duplicated
        .def_readonly("data_size", &RpcInterface::ReceivedData::data_size)
        .def("get_bytes", &RpcInterface::ReceivedData::getBytes)
        .def("get_memory_view", &RpcInterface::ReceivedData::getMemoryView);
    safe_register_class([&]() {
Don't need it.
Description
This PR adds the Python binding layer for coro_rpc_communicator. Refers to #1104.