Add multi-nodes example & update doc #455

Merged: Binyang2014 merged 13 commits into main from binyli/doc on Feb 1, 2025

Conversation

Binyang2014 (Contributor) commented Jan 22, 2025

Documentation update:

New example script:

IR module improvements:

  • python/mscclpp/language/ir.py: Refined the sorting criteria for GPU instance channels and thread block channels to include the channel type, ensuring a more accurate order.

Debugging enhancements:

  • src/executor/executor.cc: Added a debug log that marks the start of a collective communication execution, with details about the execution plan and the collective.

  • src/include/debug.h: Introduced a new debug log subsystem identifier MSCCLPP_EXECUTOR for logging executor-related information.
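
To illustrate the IR change described above, here is a minimal sketch of a sort key that includes the channel type. It is not the code from ir.py: the `ChannelKind` enum, the `Channel` record, its field names, and `sort_key` are all hypothetical stand-ins chosen for this example. The point is only that when two channels share the same buffers, adding the type to the key makes the resulting order independent of the input order.

```python
from dataclasses import dataclass
from enum import Enum


class ChannelKind(Enum):
    """Hypothetical channel kinds; the real names in ir.py may differ."""
    MEMORY = 0
    PORT = 1


@dataclass(frozen=True)
class Channel:
    """Hypothetical stand-in for a channel record."""
    src_buffer: str
    dst_buffer: str
    kind: ChannelKind


def sort_key(ch):
    # Buffers alone can tie; the channel kind breaks the tie deterministically.
    return (ch.src_buffer, ch.dst_buffer, ch.kind.value)


channels = [
    Channel("input", "output", ChannelKind.PORT),
    Channel("input", "output", ChannelKind.MEMORY),
    Channel("input", "scratch", ChannelKind.MEMORY),
]

ordered = sorted(channels, key=sort_key)
# The same result is obtained regardless of the input order.
assert ordered == sorted(reversed(channels), key=sort_key)
```

With a key made of the buffers only, the first two channels would compare equal and a stable sort would keep whatever order they arrived in, so two ranks building the same list in different orders could disagree.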

Binyang2014 changed the title from "Update the document link" to "Update the example link" Jan 22, 2025
Binyang2014 changed the title from "Update the example link" to "Add multi-nodes example & update doc" Jan 23, 2025
Binyang2014 marked this pull request as ready for review January 25, 2025 00:51

Copilot reviewed 3 out of 6 changed files in this pull request and generated no comments.

Files not reviewed (3)
  • python/test/configs/mscclpp_lang_test_config.json: Language not supported
  • src/executor/executor.cc: Language not supported
  • src/include/debug.h: Language not supported
Comments suppressed due to low confidence (1)

python/examples/allgather_allpairs_multinodes_packets.py:11

  • The docstring should clearly explain the purpose of the gpus_per_node argument, and the listed steps should either be complete or be rephrased so they do not imply that more steps follow.
def allgather_multinodes_allpair(gpus, gpus_per_node, instances):
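
As a rough illustration of the kind of docstring the comment asks for, the sketch below documents a gpus_per_node argument on a small helper. The helper `split_peers` is hypothetical and is not part of the example script. It assumes that ranks are assigned contiguously per node and that gpus is a multiple of gpus_per_node; the actual script may use a different layout.

```python
def split_peers(rank, gpus, gpus_per_node):
    """Split the peers of `rank` into same-node and cross-node lists.

    Args:
        rank: Global rank of the calling GPU, in [0, gpus).
        gpus: Total number of GPUs across all nodes.
        gpus_per_node: Number of GPUs on each node. Assumed to divide
            `gpus` evenly, with ranks assigned contiguously per node
            (an assumption of this sketch, not of the real script).

    Returns:
        A tuple (local_peers, remote_peers) of ascending global ranks.
    """
    if gpus <= 0 or gpus_per_node <= 0 or gpus % gpus_per_node != 0:
        raise ValueError("gpus must be a positive multiple of gpus_per_node")
    if not 0 <= rank < gpus:
        raise ValueError("rank must be in [0, gpus)")

    node = rank // gpus_per_node  # index of the node that hosts `rank`
    local_peers, remote_peers = [], []
    for peer in range(gpus):
        if peer == rank:
            continue
        if peer // gpus_per_node == node:
            local_peers.append(peer)
        else:
            remote_peers.append(peer)
    return local_peers, remote_peers
```

For example, with 8 GPUs and 4 GPUs per node, rank 5 sits on node 1, so its same-node peers are ranks 4, 6, and 7, and its cross-node peers are ranks 0 through 3.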
Binyang2014 requested a review from seagater January 25, 2025 00:52
docker/build.sh (review thread resolved)
Binyang2014 merged commit 7f3b088 into main Feb 1, 2025
12 of 14 checks passed
Binyang2014 deleted the binyli/doc branch February 1, 2025 01:52
2 participants