Conversation

@JasonZhang517
Contributor

Description

Adds the Python binding layer for coro_rpc_communicator. Refers to #1104.

Type of Change

  • Types
    • [ x ] Bug fix
    • New feature
      • Transfer Engine
      • Mooncake Store
      • Mooncake EP
      • Integration
      • P2P Store
      • Python Wheel
    • Breaking change
    • CI/CD
    • Documentation update
    • Other

How Has This Been Tested?

Checklist

  • I have performed a self-review of my own code.
  • I have updated the documentation.
  • I have added tests to prove my changes are effective.

luketong777 and others added 6 commits November 21, 2025 16:34
- Add cororpc_communicator and cororpc_interface headers and implementation
- Implement async RPC communication layer using yalantinglibs coro_rpc
- Add CMake configuration for coro_rpc_connector module
- Add bandwidth test script for performance validation
- Support both TCP and RDMA transports via MC_RPC_PROTOCOL env var
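
The last bullet says the transport is chosen through the MC_RPC_PROTOCOL environment variable. As a rough illustration only, the selection logic might look like the sketch below. The accepted spellings ("tcp", "rdma"), the default of "tcp", and the helper name select_rpc_protocol are assumptions made for this example; the commit message does not state them.

```python
import os

# Assumed spellings; the commit message only names "TCP" and "RDMA".
SUPPORTED_PROTOCOLS = ("tcp", "rdma")


def select_rpc_protocol(env=None, default="tcp"):
    """Return the transport named by MC_RPC_PROTOCOL (hypothetical helper)."""
    # Allow a plain dict to be passed in so the logic can be tested
    # without touching the real process environment.
    env = os.environ if env is None else env
    protocol = env.get("MC_RPC_PROTOCOL", default).strip().lower()
    if protocol not in SUPPORTED_PROTOCOLS:
        raise ValueError(
            f"MC_RPC_PROTOCOL must be one of {SUPPORTED_PROTOCOLS}, got {protocol!r}"
        )
    return protocol
```

Rejecting unknown values early, rather than silently falling back, is a design choice of this sketch and may differ from what the PR actually does.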

This introduces a new transport layer based on coro_rpc for
high-performance asynchronous communication in the transfer engine.
The implementation uses zero-copy techniques via attachments and
provides both synchronous and asynchronous APIs.

Transfer Engine Python bindings:
- Add coro_rpc interface integration to transfer_engine_py.cpp
- Support transfer operations with notify mechanism
- Add getNotifies() method for retrieving transfer notifications
- Improve batch transfer sync APIs with notify support

Store Python bindings:
- Add batch_put_from_multi_buffers() for efficient batch operations
- Add batch_get_into_multi_buffers() for batch retrieval
- Enhance setup() with additional configuration options
- Improve error handling and parameter validation

Testing and utilities:
- Add comprehensive test_tensor_api.py test script
- Update mooncake_config.py with new configuration options
- Update CMakeLists.txt for Python integration modules

Note: This PR depends on:
- PR1 (coro_rpc_connector) for transfer_engine_py.cpp
- PR2 (Store enhancements) for store_py.cpp
@gemini-code-assist
Contributor

Summary of Changes

Hello @JasonZhang517, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the interoperability of the Mooncake project by integrating Python bindings for the coro_rpc_communicator. This enables Python-based components to seamlessly utilize the existing C++ RPC communication infrastructure, broadening the scope for development and integration within the ecosystem.

Highlights

  • Python Bindings for CoroRPCCommunicator: This pull request introduces a new Python binding layer for the coro_rpc_communicator, allowing Python applications to directly interact with the C++ CoroRPCInterface for remote procedure calls.
  • CMake Build System Updates: The CMake configuration has been updated to correctly build and link the new Python bindings. This includes finding Python3 development components, adding necessary include directories, and conditionally linking against the coro_rpc_connector library.
  • Exposed RPC Functionality: The Python bindings expose a comprehensive set of RPC functionalities, including methods for initializing clients and servers, sending and receiving data and tensors, and setting up asynchronous receive callbacks. This provides a full-featured Pythonic interface to the underlying C++ RPC system.
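
To make the "sending and receiving data" highlight more concrete: the binding code quoted later in this review exposes data_size, get_bytes, and get_memory_view on a received-data object. The pure-Python stand-in below only illustrates the general difference between returning a bytes copy and returning a memoryview over the same buffer. It is not the Mooncake class; the name ReceivedDataStandIn and the copy-versus-view behaviour are assumptions, since the PR does not document how the real methods manage memory.

```python
class ReceivedDataStandIn:
    """Hypothetical stand-in; not the class exported by the bindings."""

    def __init__(self, payload: bytes):
        self._buffer = bytearray(payload)  # owns the received payload
        self.data_size = len(self._buffer)

    def get_bytes(self) -> bytes:
        # Copies the payload into an immutable bytes object.
        return bytes(self._buffer)

    def get_memory_view(self) -> memoryview:
        # Exposes the same buffer without copying it.
        return memoryview(self._buffer)


received = ReceivedDataStandIn(b"hello")
copy = received.get_bytes()
view = received.get_memory_view()

view[0] = ord("j")  # writes through to the underlying buffer
```

After the write, the earlier copy is unchanged while the view and any new get_bytes() call reflect the modified buffer. With a C++-backed buffer, a view is typically only safe while the owning object is alive, which is one reason a binding might offer a copying accessor as well.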

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces Python bindings for the coro_rpc_communicator. The changes primarily involve updating CMakeLists.txt to incorporate the new dependencies and adding the necessary pybind11 wrapper code in transfer_engine_py.cpp. The Python bindings themselves are well-implemented. My review focuses on a couple of areas for improvement in the CMake script to remove redundancy and prevent potential build issues.

Comment on lines 60 to 83
if (TARGET coro_rpc_connector)
    target_link_libraries(engine PUBLIC coro_rpc_connector)
endif()
Contributor

high

The object files from coro_rpc_connector are included on line 41 via $<TARGET_OBJECTS>. Linking the target again here is either redundant (for static libraries) or an error (for object libraries). Please remove this block to avoid redundancy and potential build failures.

Comment on lines 56 to 79
if (TARGET transfer_engine)
    target_link_libraries(engine PUBLIC transfer_engine)
endif()
Contributor

medium

The transfer_engine target is already linked to the engine target on line 51. This if block is redundant and can be removed to simplify the script.

Copilot AI review requested due to automatic review settings December 17, 2025 06:05
Contributor

Copilot AI left a comment

Pull request overview

This PR attempts to add Python bindings for the coro_rpc_communicator component to the Mooncake transfer engine. However, the implementation has several critical issues that prevent it from compiling. The changes add new binding code in transfer_engine_py.cpp and update the CMakeLists.txt build configuration, but reference files and functions that don't exist in the codebase.

Key issues:

  • References a non-existent header file (cororpc_interface.h) and undefined classes/functions
  • Contains duplicate CMake target definitions that will cause build failures
  • Uses an undefined CMake variable (ENGINE_MODULE_SOURCES)

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File descriptions:
  • mooncake-integration/transfer_engine/transfer_engine_py.cpp: Adds Python bindings for CoroRPCInterface, ReceivedData, and ReceivedTensor classes, plus factory functions. However, the included header and referenced symbols don't exist.
  • mooncake-integration/CMakeLists.txt: Updates build configuration with duplicate pybind11_add_module definitions, redundant Python3 package finding, and conditional linking to the coro_rpc_connector target.

Comment on lines 766 to 769
m.def("create_rpc_client", &createRPCClient, "local_rank"_a,
      "world_size"_a);
m.def("create_rpc_server", &createRPCServer, "local_rank"_a,
      "world_size"_a);
Copilot AI Dec 17, 2025

The factory function names createRPCClient and createRPCServer use inconsistent casing compared to the existing pattern in the codebase. The existing rpc_interface.h uses createRpcClient and createRpcServer (with 'Rpc' in camelCase, not 'RPC' in all caps). For consistency, these should be named createCoroRpcClient and createCoroRpcServer to match the existing naming convention.

Copilot uses AI. Check for mistakes.
Comment on lines 766 to 769
m.def("create_rpc_client", &createRPCClient, "local_rank"_a,
      "world_size"_a);
m.def("create_rpc_server", &createRPCServer, "local_rank"_a,
      "world_size"_a);
Copilot AI Dec 17, 2025

The functions createRPCClient and createRPCServer are referenced but not declared or defined anywhere in the codebase. These factory functions need to be either: 1) declared and defined in the cororpc_interface.h header (similar to how createRpcClient and createRpcServer are defined in rpc_interface.h), or 2) removed from these bindings if they don't exist. Without proper declarations, this code will not compile.

Comment on lines 719 to 720
// Forward declaration for coro_rpc_interface binding function
void bind_coro_rpc_interface(py::module_ &m);
Copilot AI Dec 17, 2025

The forward declaration of bind_coro_rpc_interface is immediately followed by its implementation in the same file. This forward declaration is unnecessary since the function is defined before it's used in the PYBIND11_MODULE. Consider removing the forward declaration and keeping only the implementation.

endif()
endif()

pybind11_add_module(engine ${ENGINE_MODULE_SOURCES})
Copilot AI Dec 17, 2025

The variable ENGINE_MODULE_SOURCES is used but never defined in this CMakeLists.txt file. This will cause the pybind11_add_module command to create an empty module or fail. Either define ENGINE_MODULE_SOURCES before this line, or replace it with the actual source files needed for the engine module (likely transfer_engine/transfer_engine_py.cpp).

endif()
endif()

pybind11_add_module(engine ${ENGINE_MODULE_SOURCES})
Copilot AI Dec 17, 2025

The pybind11_add_module for 'engine' is defined twice in this file - once at line 35 (inside the WITH_TE conditional block) and again at line 65 (unconditionally). This creates a duplicate target definition which will cause a CMake configuration error. Remove one of these duplicate definitions. Since the one at line 65 appears to be the newer addition for coro_rpc support, you likely need to either merge the two definitions or ensure only one is active based on the build configuration.

include_directories("./")
include_directories("../mooncake-transfer-engine/include")

find_package(Python3 COMPONENTS Interpreter Development REQUIRED)
Collaborator

@stmatengss Dec 17, 2025

Please see line 46! It's duplicated.

// Include coro_rpc_interface headers
#include "transport/coro_rpc_connector/cororpc_interface.h"

using namespace pybind11::literals;
Collaborator

py::literals are already accessible.

@stmatengss
Collaborator

Please address the issues above.

@stmatengss
Collaborator

Please fix the compilation errors so this can be merged quickly. Thanks.

@stmatengss
Collaborator

Please fix the code formatting.

@codecov-commenter

codecov-commenter commented Jan 4, 2026

Codecov Report

❌ Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.

Files with missing lines:
  • ...integration/transfer_engine/transfer_engine_py.cpp: patch coverage 0.00%, 3 lines missing
  • ...rc/transport/rpc_communicator/rpc_communicator.cpp: patch coverage 0.00%, 1 line missing

namespace py = pybind11;

namespace mooncake {
namespace py = pybind11;
Collaborator

duplicated

@JasonZhang517
Contributor Author

.def_readonly("data_size", &RpcInterface::ReceivedData::data_size)
.def("get_bytes", &RpcInterface::ReceivedData::getBytes)
.def("get_memory_view", &RpcInterface::ReceivedData::getMemoryView);
safe_register_class([&]() {
Collaborator

Don't need it.
