Open
33 commits
8689c6a
adding a new nccl window buffer manager that goes with the lifetime o…
nv-lschneider Nov 13, 2025
55a8b0b
namespace for test fixing
nv-lschneider Nov 14, 2025
4d5df84
linking stuff for tests
nv-lschneider Nov 14, 2025
95c5d22
moving NCCL helper
nv-lschneider Nov 14, 2025
2ab6750
using new NCCL util
nv-lschneider Nov 17, 2025
8e80a7a
remove NCCLUBAllocator
nv-lschneider Nov 17, 2025
98a67fb
Changing default strategy to NCCL_SYMMETRIC
nv-lschneider Nov 18, 2025
df49fdb
cleaning up and adding tests before PR
nv-lschneider Nov 18, 2025
00d7c24
fixing python tests
nv-lschneider Nov 18, 2025
a167243
fixing problems
nv-lschneider Nov 18, 2025
ae1e46b
test hardeing
nv-lschneider Nov 18, 2025
4a4d36d
redesign tests
nv-lschneider Nov 18, 2025
63d4762
one test after the other
nv-lschneider Nov 18, 2025
c99eed7
one test after the other
nv-lschneider Nov 18, 2025
3600b4a
asdf
nv-lschneider Nov 18, 2025
41c84b5
updating test
nv-lschneider Nov 19, 2025
27ef99b
fixing test pickling
nv-lschneider Nov 19, 2025
1201c50
wrapping
nv-lschneider Nov 19, 2025
78031fd
fix remaining tests
nv-lschneider Nov 19, 2025
d146269
adding simple arithmetic to the test
nv-lschneider Nov 19, 2025
1040446
add AR to the test
nv-lschneider Nov 19, 2025
f473e15
rename test sensibly
nv-lschneider Nov 19, 2025
e91e5f0
addressing coderabbit
nv-lschneider Nov 19, 2025
ce3efcd
more code rabbit comments
nv-lschneider Nov 19, 2025
14681f1
fixes
nv-lschneider Nov 19, 2025
ac22fef
addressing review comments
nv-lschneider Nov 21, 2025
4cef4db
adding empirical model explanation
nv-lschneider Nov 21, 2025
7a9da35
removing ncclWindowTensor python interface
nv-lschneider Nov 25, 2025
53e7ae8
registering the new test for CI
nv-lschneider Nov 25, 2025
6df9dfe
removing ncclWindowTensor from build system
nv-lschneider Nov 25, 2025
99fbbe5
querying MNNVL to determine if it is worth it to copy data
nv-lschneider Nov 25, 2025
f59f43c
fixes
nv-lschneider Nov 25, 2025
f4a0f84
cosmetic fixes
nv-lschneider Nov 25, 2025
5 changes: 2 additions & 3 deletions cpp/tensorrt_llm/common/customAllReduceUtils.h
@@ -81,7 +81,6 @@ inline AllReduceStrategyType SelectStrategyLP(size_t seq_len, size_t hidden_size
     {
         return AllReduceStrategyType::ONESHOT;
     }
-    return AllReduceStrategyType::NCCL;
 }

 // use 1D vector to store the best strategy instead of a map for each sm version
@@ -143,15 +142,15 @@ inline AllReduceStrategyType selectStrategyLookUpTable(
         sm_version = 100;
     }

-    // Check if the entry is out of bounds, otherwise return NCCL as fallback
+    // Check if the entry is out of bounds, otherwise return NCCL_SYMMETRIC as fallback
     if (AllReduceBestStrategyTable.find(sm_version) == AllReduceBestStrategyTable.end()
         || tp_index >= AllReduceBestStrategyTable.at(sm_version).size()
         || fusion_op_index >= AllReduceBestStrategyTable.at(sm_version).at(tp_index).size()
         || hidden_size_index >= AllReduceBestStrategyTable.at(sm_version).at(tp_index).at(fusion_op_index).size()
         || num_token_index
             >= AllReduceBestStrategyTable.at(sm_version).at(tp_index).at(fusion_op_index).at(hidden_size_index).size())
     {
-        return AllReduceStrategyType::NCCL;
+        return AllReduceStrategyType::NCCL_SYMMETRIC;
     }

     return static_cast<AllReduceStrategyType>(
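Note on the second hunk: the changed fallback can be read as "return the table entry if every index is in range, otherwise return NCCL_SYMMETRIC". Below is a minimal standalone sketch of that pattern, not the code from customAllReduceUtils.h. The enum values, the table layout (sm_version mapped to nested vectors indexed by tp, fusion op, hidden size, and token count), and the helper name lookupOrFallback are all assumptions made for illustration. Only the identifiers AllReduceStrategyType, NCCL, ONESHOT, and NCCL_SYMMETRIC appear in the diff.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the real enum; only these three names appear in the diff.
enum class AllReduceStrategyType
{
    NCCL,
    ONESHOT,
    NCCL_SYMMETRIC
};

// Assumed table shape: sm_version -> [tp][fusion_op][hidden_size][num_token].
using TokenRow = std::vector<AllReduceStrategyType>;
using HiddenRows = std::vector<TokenRow>;
using FusionRows = std::vector<HiddenRows>;
using TpRows = std::vector<FusionRows>;
using StrategyTable = std::unordered_map<int, TpRows>;

// Hypothetical helper: walks the table one level at a time and returns
// NCCL_SYMMETRIC as soon as any key or index is out of range.
inline AllReduceStrategyType lookupOrFallback(StrategyTable const& table, int sm_version, std::size_t tp_index,
    std::size_t fusion_op_index, std::size_t hidden_size_index, std::size_t num_token_index)
{
    auto const fallback = AllReduceStrategyType::NCCL_SYMMETRIC;

    auto const it = table.find(sm_version);
    if (it == table.end())
    {
        return fallback;
    }
    TpRows const& byTp = it->second;
    if (tp_index >= byTp.size())
    {
        return fallback;
    }
    FusionRows const& byFusion = byTp[tp_index];
    if (fusion_op_index >= byFusion.size())
    {
        return fallback;
    }
    HiddenRows const& byHidden = byFusion[fusion_op_index];
    if (hidden_size_index >= byHidden.size())
    {
        return fallback;
    }
    TokenRow const& byTokens = byHidden[hidden_size_index];
    if (num_token_index >= byTokens.size())
    {
        return fallback;
    }
    return byTokens[num_token_index];
}
```

Walking the levels one by one avoids repeating the chained `.at(...)` calls in every condition, at the cost of a longer function. Whether that trade-off suits the real header is a judgment for the author.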