Implementing hierarchical distributor #2
base: main
dedbdc9 to
5993b7b
Compare
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff            @@
##              main        #2   +/-   ##
==========================================
  Coverage   100.00%   100.00%
==========================================
  Files           13        17     +4
  Lines          627       937   +310
  Branches        66        93    +27
==========================================
+ Hits           627       937   +310
1ebad76 to
69ca93c
Compare
69ca93c to
f65e640
Compare
Pull Request Overview
This PR implements a hierarchical work distributor as an alternative to the naive distributor, introducing a tree-based task distribution pattern for improved scalability in MPI environments. The hierarchical approach organizes workers in a coordinator-worker hierarchy to reduce communication overhead with the root manager.
Key changes include:
- Implementation of HierarchicalMPIWorkDistributor with tree-based task distribution
- Addition of utility headers for assertions and printing support
- Updates to test suite to handle both ordered and unordered result patterns
- Enhanced test coverage with additional MPI rank configurations
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| include/dynampi/impl/hierarchical_distributor.hpp | Core implementation of hierarchical work distribution with coordinator-worker tree structure |
| include/dynampi/utilities/assert.hpp | Custom assertion macros with MPI-aware error reporting and source location support |
| include/dynampi/utilities/printing.hpp | Stream operators for common container types to support debugging output |
| test/mpi/test_distributers.cpp | Updated tests to handle both distributors with conditional logic for ordered/unordered results |
| include/dynampi/impl/naive_distributor.hpp | Minor refactoring to use communicator probe method |
| include/dynampi/mpi/mpi_communicator.hpp | Added probe method for non-blocking message detection |
| test/lsan.supp | Additional leak suppression entries for MPI library components |
| test/CMakeLists.txt | Extended test matrix to include 16-rank configuration |
| include/dynampi/mpi/mpi_types.hpp | Added string header include |
| benchmark/asymptotic_distribution_throughput.cpp | Commented out worker task count statistics output |
if constexpr (statistics_mode >= StatisticsMode::Aggregated) {
  if (is_root_manager()) _statistics.worker_task_counts.resize(_communicator.size(), 0);
} else {
  std::cout << "Not auto-running worker on rank " << _communicator.rank() << std::endl;
Copilot
AI
Aug 22, 2025
Debug output statements should be removed from production code or made conditional based on a debug flag. These cout statements will create unnecessary output in production environments.
  if (is_root_manager()) _statistics.worker_task_counts.resize(_communicator.size(), 0);
} else {
  std::cout << "Not auto-running worker on rank " << _communicator.rank() << std::endl;
}
Copilot
AI
Aug 22, 2025
Debug output statements should be removed from production code or made conditional based on a debug flag. These cout statements will create unnecessary output in production environments.
  }
  std::cout << child << " ";
}
std::cout << std::endl;
Copilot
AI
Aug 22, 2025
Debug output statements should be removed from production code or made conditional based on a debug flag. These cout statements will create unnecessary output in production environments.
if constexpr (statistics_mode >= StatisticsMode::Aggregated) {
  _statistics.worker_task_counts[worker]++;
if (request.num_tasks_requested.has_value()) {
  std::cout << "Allocating batch " << std::endl;
Copilot
AI
Aug 22, 2025
Debug output statements should be removed from production code or made conditional based on a debug flag. These cout statements will create unnecessary output in production environments.
Suggested change:
- std::cout << "Allocating batch " << std::endl;
+ if (_debug) {
+   std::cout << "Allocating batch " << std::endl;
+ }
  _communicator.send(tasks, worker, Tag::TASK_BATCH);
  _tasks_sent_to_child += tasks.size();
} else {
  std::cout << "Allocating single " << std::endl;
Copilot
AI
Aug 22, 2025
Debug output statements should be removed from production code or made conditional based on a debug flag. These cout statements will create unnecessary output in production environments.
Suggested change:
- std::cout << "Allocating single " << std::endl;
+ if (debug) {
+   std::cout << "Allocating single " << std::endl;
+ }
d05ab60 to
ce38e0f
Compare
1ec6669 to
5c62ccb
Compare
2199113 to
7451805
Compare
7451805 to
82d0ba0
Compare
Note: Other AI code review bot(s) detected
CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.
Walkthrough
Converts CI tests from parallel to verbose serial; expands .gitignore. Generalizes benchmark build and adds a new pingpong benchmark. Overhauls hierarchical distributor to batch-based topology; refactors naive distributor internals. Enhances MPI communicator with probe/gather. Adds assertion and printing utilities. Updates MPI types include. Extends tests and LSAN suppressions.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant M as Manager (root)
participant C as Coordinator
participant W as Leaf Worker
rect rgb(245,250,255)
note right of M: Initialization/topology setup
M->>C: TASK_BATCH (initial tasks)
end
loop Work requests
W->>C: REQUEST_BATCH (n)
alt Tasks available
C->>W: TASK_BATCH (<= n)
W->>W: Execute batch
W-->>C: RESULT_BATCH
else No tasks left
C-->>W: DONE
end
end
rect rgb(250,245,255)
note over C,M: Upstream aggregation
C-->>M: RESULT_BATCH (child aggregate)
end
M-->>C: DONE (propagate when complete)
C-->>W: DONE
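The request/batch exchange in the diagram above can be modelled in a single process, without MPI. This is only a sketch under stated assumptions: `Coordinator`, `handle_request`, and `run_leaf` are hypothetical names, not dynampi's API, and an empty optional stands in for the DONE message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <vector>

// Hypothetical stand-in for a coordinator's local task queue (not dynampi's API).
struct Coordinator {
  std::deque<int> pending;

  // Answer REQUEST_BATCH(n): hand out at most n tasks, or nullopt to signal DONE.
  std::optional<std::vector<int>> handle_request(std::size_t n) {
    if (pending.empty()) return std::nullopt;
    const auto count = static_cast<std::ptrdiff_t>(std::min(n, pending.size()));
    std::vector<int> batch(pending.begin(), pending.begin() + count);
    pending.erase(pending.begin(), pending.begin() + count);
    return batch;
  }
};

// Leaf worker loop: keep requesting batches until DONE, squaring each task.
inline std::vector<int> run_leaf(Coordinator& coordinator, std::size_t batch_size) {
  assert(batch_size > 0);  // a zero-sized request would never drain the queue
  std::vector<int> results;
  while (auto batch = coordinator.handle_request(batch_size)) {
    for (int task : *batch) results.push_back(task * task);
  }
  return results;
}
```

With five pending tasks and a batch size of two, the worker receives batches of 2, 2, and 1 tasks before the coordinator reports DONE.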
sequenceDiagram
autonumber
participant A as Rank A
participant B as Rank B
participant R0 as Rank 0 (collector)
note over A,B: For each size × method<br/>warmup then timed iters
A->>B: send/isend/bsend/ssend (ping)
B-->>A: reply (pong)
A-->>A: measure RTT & send-time
B-->>A: send back B->A results
par Pair barrier
A->>B: MPI_Barrier
B->>A: MPI_Barrier
end
A-->>R0: MPI_Gatherv CSV lines
B-->>R0: MPI_Gatherv CSV lines
R0-->>R0: Write CSV
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
include/dynampi/impl/hierarchical_distributor.hpp (1)
8-17: Fix missing standard headers; remove unused iostream.
Compilation will fail due to missing headers for std::optional, std::deque (used in a type-trait check), and std::monostate. Also, <iostream> is included but not used.
Apply:
 #include <algorithm>
 #include <cassert>
 #include <functional>
-#include <iostream>
 #include <iterator>
 #include <ranges>
 #include <span>
 #include <stack>
 #include <type_traits>
 #include <vector>
+#include <optional>
+#include <variant>
+#include <deque>
♻️ Duplicate comments (3)
include/dynampi/utilities/printing.hpp (1)
70-82: Set printing: trailing-comma issue resolved; looks good.
Iteration and delimiter handling are correct and handle empty sets without extra commas. This addresses the prior concern raised in earlier review comments.
include/dynampi/utilities/assert.hpp (1)
125-130: DYNAMPI_FAIL is UB in release (NDEBUG); fix to terminate instead.
In NDEBUG builds, DYNAMPI_FAIL expands to UNREACHABLE with no side effect. If executed, this is undefined behavior and can lead to miscompilations. In debug builds, the UNREACHABLE after a throw is also unnecessary.
-#define DYNAMPI_FAIL(...)             \
-  DYNAMPI_ASSERT(false, __VA_ARGS__); \
-  DYNAMPI_UNREACHABLE()  // LCOV_EXCL_LINE
+#ifdef NDEBUG
+#define DYNAMPI_FAIL(...)                            \
+  do {                                               \
+    std::terminate();                                \
+    DYNAMPI_UNREACHABLE(); /* for optimizer hints */ \
+  } while (false)
+#else
+#define DYNAMPI_FAIL(...)               \
+  do {                                  \
+    DYNAMPI_ASSERT(false, __VA_ARGS__); \
+  } while (false)
+#endif
test/mpi/test_distributers.cpp (1)
103-109: Redundant assert alongside EXPECT_EQ
The raw assert(result == std::vector<size_t>({0, 1, 4, 9})); is redundant with the following EXPECT_EQ. Prefer one assertion style to avoid duplicate failures and keep output consistent.
- assert(result == std::vector<size_t>({0, 1, 4, 9}));
  EXPECT_EQ(result, std::vector<size_t>({0, 1, 4, 9}));
🧹 Nitpick comments (28)
include/dynampi/mpi/mpi_types.hpp (1)
101-111: Confirm C++ standard; make std::string ptr portable pre-C++17.
MPI_Type<std::string>::ptr uses str.data() as a writable buffer. That requires C++17 or later, where the non-const data() overload returns char*. If any targets still compile with C++14, data() returns const char*, so this is ill-formed as written and may be UB if const is cast away. Either (a) ensure -std=c++17 or higher across the project, or (b) switch to &str[0] with an empty check for a portable writable pointer.
Apply this diff for option (b):
 static int count(const std::string& str) { return static_cast<int>(str.size()); }
 static void resize(std::string& str, int new_size) { str.resize(static_cast<size_t>(new_size)); }
-static void* ptr(std::string& str) noexcept { return str.data(); }
-static const void* ptr(const std::string& str) noexcept { return str.data(); }
+static void* ptr(std::string& str) noexcept {
+  return str.empty() ? nullptr : static_cast<void*>(&str[0]);
+}
+static const void* ptr(const std::string& str) noexcept { return str.data(); }
If you confirm the project is C++17+, feel free to keep the current implementation and ignore this refactor.
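Option (b) can be tried in isolation. The sketch below uses hypothetical free functions (`writable_ptr`, `fill_from`) rather than the MPI_Type specialization from the PR, and `std::memcpy` stands in for the MPI receive that would write into the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not part of dynampi): a writable buffer pointer that
// also compiles before C++17, where std::string::data() returns const char*.
inline void* writable_ptr(std::string& str) noexcept {
  return str.empty() ? nullptr : static_cast<void*>(&str[0]);
}

// Mimics a receive: resize first, then let the "transport" write into the buffer.
inline void fill_from(std::string& dst, const char* src, std::size_t n) {
  assert(src != nullptr || n == 0);
  dst.resize(n);
  if (n > 0) std::memcpy(writable_ptr(dst), src, n);
}
```

Returning nullptr for an empty string is safe here only because the copy is skipped when the size is zero; callers that pass the pointer on unconditionally would need their own check.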
.github/workflows/sanitizers.yml (1)
49-49: Switching tests to verbose serial runs is appropriate for sanitizer signal fidelity
Running ctest serially with --verbose improves attribution of sanitizer reports. If you ever need to guarantee serialization regardless of future ctest defaults, you can add -j 1 explicitly.
Apply this minimal tweak if you want the guarantee:
-ctest --output-on-failure --verbose -C Debug
+ctest --output-on-failure --verbose -j 1 -C Debug
test/lsan.supp (1)
6-19: Be surgical with suppressions; document versions and rationale
The added frames look OpenMPI/PMIx/hwloc related and are often benign allocator residue, but broad entries like leak:strdup and leak:opal_vasprintf can mask real leaks in your code paths that traverse those functions. Consider:
- Adding brief comments grouping entries by dependency and version (e.g., OpenMPI 4.x on Ubuntu 24.04), with links to upstream issues if available.
- Scoping to the narrowest symbol names possible; prefer library-specific frames over generic C library functions where feasible.
Would you like me to draft a commented, grouped suppression file keyed by detected MPI/PMIx/hwloc versions?
benchmark/asymptotic_distribution_throughput.cpp (2)
77-84: Result payload no longer depends on --message_size; CLI option now has no effect
By switching Result to size_t and returning the task, the --message_size and message_size_list options no longer influence communication volume or work cost. If that’s intentional, consider removing or deprecating the flags to avoid confusion; if not, reintroduce a payload sized to opts.message_size.
Reintroduce optional payload (keeps your current fast path by default):
- //using Result = std::vector<std::byte>;
- using Result = size_t;
+ // Toggle to control payload size for result traffic
+ constexpr bool kUsePayload = true;
+ using Result = std::conditional_t<kUsePayload, std::vector<std::byte>, size_t>;
@@
- //auto worker_task = [&opts](Task task) -> Result {
- //return std::vector<std::byte>(opts.message_size, std::byte(task));
- auto worker_task = [](Task task) -> Result {
-   return task;
- };
+ auto worker_task =
+     [&opts](Task task) -> Result {
+       if constexpr (kUsePayload) {
+         return std::vector<std::byte>(opts.message_size, std::byte(task));
+       } else {
+         return task;
+       }
+     };
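A standalone sketch of the compile-time payload toggle follows. The names `ResultT` and `make_result` are illustrative, not dynampi identifiers. One assumption worth stating: the toggle is a template parameter here because `if constexpr` only discards the untaken branch inside a template; in a non-template lambda both `return` statements are still type-checked against the single `Result` type.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Result type selected at compile time: a byte payload or just the task id.
template <bool kUsePayload>
using ResultT = std::conditional_t<kUsePayload, std::vector<std::byte>, std::size_t>;

template <bool kUsePayload>
ResultT<kUsePayload> make_result(std::size_t task, std::size_t message_size) {
  if constexpr (kUsePayload) {
    // Payload path: result traffic scales with message_size.
    return std::vector<std::byte>(message_size, static_cast<std::byte>(task & 0xFF));
  } else {
    // Fast path: return the task id; message_size is ignored.
    (void)message_size;
    return task;
  }
}

static_assert(std::is_same_v<ResultT<false>, std::size_t>, "fast path returns the task id");
```

Calling `make_result<true>(task, opts_message_size)` restores the dependence on the message size, while `make_result<false>` keeps the cheap path.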
109-112: Ensure statistics pointer is valid before dereference
This assumes that Detailed statistics mode guarantees a non-null worker_task_counts. If that invariant can break (e.g., if the constructor changes), add a defensive check to avoid UB.
For a cheap guard:
- for (size_t i = 0; i < stats.worker_task_counts->size(); i++) {
+ DYNAMPI_ASSERT(stats.worker_task_counts && "worker_task_counts must be initialized");
+ for (size_t i = 0; i < stats.worker_task_counts->size(); ++i) {
If you want, I can scan the repo to confirm the invariant holds for track_statistics.
benchmark/pingpong.cpp (4)
360-367: Silence static analysis (CWE‑126) by avoiding strlen on a literal
The warning is a false positive for a string literal, but using sizeof on a char array removes it.
Apply:
- const char *header =
-     "src_rank,dst_rank,method,direction,locality,msg_bytes,iters,avg_rtt_seconds,latency_seconds,bandwidth_MBps,send_call_total_seconds\n";
- FILE *fp = std::fopen(opt.outfile.c_str(), "wb");
+ static constexpr char header[] =
+     "src_rank,dst_rank,method,direction,locality,msg_bytes,iters,avg_rtt_seconds,latency_seconds,bandwidth_MBps,send_call_total_seconds\n";
+ FILE *fp = std::fopen(opt.outfile.c_str(), "wb");
  if (!fp) {
    std::cerr << "Failed to open output file: " << opt.outfile << std::endl;
    MPI_Abort(MPI_COMM_WORLD, 2);
  }
- std::fwrite(header, 1, std::strlen(header), fp);
+ std::fwrite(header, 1, sizeof(header) - 1, fp);
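The sizeof-based length can be checked at compile time. The sketch below uses a shortened, illustrative header (`kHeader`) rather than the full CSV header from the benchmark.

```cpp
#include <cstddef>
#include <cstring>

// Shortened, illustrative header; the real CSV header has more columns.
static constexpr char kHeader[] = "src_rank,dst_rank,method\n";

// sizeof counts the trailing '\0', so subtract one to get the bytes to write.
constexpr std::size_t kHeaderLen = sizeof(kHeader) - 1;
static_assert(kHeaderLen == 25, "length is known at compile time");

// Runtime cross-check against strlen, which scans for the terminator instead.
inline bool lengths_agree() { return std::strlen(kHeader) == kHeaderLen; }
```

This only works while the header is a true array; if it decays to a `const char*`, `sizeof` yields the pointer size instead.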
55-118: Parsing routine is long and branchy; split for readability and testability
parse_args weighs in at ~60 LOC with many branches. Extract “parse --methods” and “print_help_and_exit” helpers to reduce complexity and make unit testing easier.
If you want, I can propose a small refactor that drops cyclomatic complexity by ~50% while preserving behavior.
123-211: ping_once complexity is high; consider isolating “send step” and “recv step” paths
The unified pattern is sound, but 80+ lines and many “if (method == …)” checks make it harder to modify. Extract “do_send” and “do_recv” lambdas or function templates parameterized by Method to simplify.
Happy to sketch a templated variant that compiles down the conditionals and shortens this to ~40 LOC.
341-357: Potential overflow with int recvcounts/displs on large CSVs
For very large sweeps, per-rank CSV chunks could exceed INT_MAX, truncating recvcounts/displs. Not urgent, but consider using MPI_Aint-sized counts with MPIX_Gatherv_x (if available) or chunking.
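Short of switching to large-count collectives, one cheap mitigation is to fail loudly when a count or displacement no longer fits in an int. The helpers below (`checked_count`, `make_displs`) are hypothetical and not taken from pingpong.cpp; they only cover the conversion step, not the gather itself.

```cpp
#include <climits>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Convert a byte count to the int that MPI_Gatherv expects, throwing instead
// of silently truncating.
inline int checked_count(std::size_t n) {
  if (n > static_cast<std::size_t>(INT_MAX)) {
    throw std::overflow_error("count exceeds INT_MAX");
  }
  return static_cast<int>(n);
}

// Build displacements as a prefix sum over per-rank sizes, checking every
// size and every running offset.
inline std::vector<int> make_displs(const std::vector<std::size_t>& sizes) {
  std::vector<int> displs;
  std::size_t offset = 0;
  for (std::size_t size : sizes) {
    displs.push_back(checked_count(offset));
    checked_count(size);
    offset += size;
  }
  return displs;
}
```

For per-rank sizes 3, 5, and 2 the displacements are 0, 3, and 8.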
benchmark/CMakeLists.txt (1)
12-26: Optionally set an explicit C++ standard on the benchmarks.
If dynampi doesn’t propagate CXX standard requirements transitively, these targets may build with an older default on some environments. Consider setting C++20 (or whatever dynampi requires) to avoid surprises.
 foreach(benchmark IN LISTS benchmarks)
   add_executable(${benchmark} ${benchmark}.cpp )
+  # Ensure the benchmark targets use the same C++ standard as the library
+  set_property(TARGET ${benchmark} PROPERTY CXX_STANDARD 20)
   target_link_libraries(${benchmark} PRIVATE dynampi cxxopts::cxxopts )
 endforeach()
include/dynampi/utilities/printing.hpp (1)
58-68: Avoid duplication: delegate std::array printing to the span overload.
Leverage the span formatter to keep the delimiter logic in one place.
 template <typename T, std::size_t N>
 inline std::ostream& operator<<(std::ostream& os, const std::array<T, N>& arr) {
-  os << "[";
-  for (std::size_t i = 0; i < arr.size(); ++i) {
-    os << arr[i];
-    if (i < arr.size() - 1) {
-      os << ", ";
-    }
-  }
-  return os << "]";
+  return os << std::span<const T, N>(arr);
 }
test/unit/test_printing.cpp (1)
62-66: Consider a couple of edge-case tests.
- Empty set: EXPECT_EQ(to_str(std::set{}), "{}").
- Tuple with zero/one element: std::tuple<> prints "()", std::tuple{1} prints "(1)".
Also applies to: 95-98
include/dynampi/utilities/assert.hpp (2)
60-66: Optional: avoid calling MPI_Comm_rank before MPI_Init.
If an assertion triggers before MPI is initialized, MPI_Comm_rank is not valid. Consider falling back to rank 0 when MPI_Initialized reports false.
 int rank = 0;
- MPI_Comm_rank(MPI_COMM_WORLD, &rank);
+ int inited = 0;
+ if (MPI_Initialized(&inited) == MPI_SUCCESS && inited) {
+   (void)MPI_Comm_rank(MPI_COMM_WORLD, &rank);
+ }
Note: If you adopt this, you may want to stub MPI_Initialized in the test the same way MPI_Comm_rank is stubbed.
Also applies to: 73-75
11-15: Unused DYNAMPI_HAS_BUILTIN helper.
DYNAMPI_HAS_BUILTIN is defined but never used in this header. Consider removing to reduce noise.
-#ifndef _MSC_VER
-#define DYNAMPI_HAS_BUILTIN(x) __has_builtin(x)
-#else
-#define DYNAMPI_HAS_BUILTIN(x) 0
-#endif
test/unit/test_assert.cpp (1)
69-85: Destructor assert during throw is correctly suppressed; consider adding a comment in header.
The test relies on std::uncaught_exceptions() > 0 to suppress assertions during unwinding. Consider documenting this behavior in assert.hpp to make the intentional suppression explicit.
test/CMakeLists.txt (1)
94-96: Gate multi-rank MPI tests on MPI availability to avoid empty test commands in non-MPI environments
If MPI wasn’t discovered (MPIEXEC_EXECUTABLE unset), the current unconditional loop will register tests whose command starts with an empty program, which CTest treats poorly. Wrap the loop with an MPI check.
Apply this diff:
-foreach(rank 1 2 3 4 8 16)
-  add_mpi_test(mpi_test_${rank}_rank ${rank})
-endforeach()
+if(MPIEXEC_EXECUTABLE)
+  foreach(rank 1 2 3 4 8 16)
+    add_mpi_test(mpi_test_${rank}_rank ${rank})
+  endforeach()
+endif()
Optional: make the 16-rank test opt-in in CI via a cache var (e.g., DYNAMPI_ENABLE_16_RANK_TESTS) to reduce oversubscription flakiness on small runners.
include/dynampi/impl/naive_distributor.hpp (5)
101-121: Use a symmetric “send_empty_message” for clarity and consistency
Workers signal readiness with send(nullptr, ...) while managers use recv_empty_message(...). Consider adding MPICommunicator::send_empty_message(dest, tag) and using it here for symmetry and readability.
Example change in this file (after you add the helper to MPICommunicator):
- m_communicator.send(nullptr, m_config.manager_rank, Tag::REQUEST);
+ m_communicator.send_empty_message(m_config.manager_rank, Tag::REQUEST);
161-181: Fairness: LIFO free-worker stack biases towards the most recently freed worker
Using std::stack biases scheduling; some workers can be underutilized under heavy load. If fairness matters, prefer FIFO (queue) or a policy toggle.
- std::stack<int, std::vector<int>> m_free_worker_indices;
+ // Prefer FIFO to avoid bias; adjust push/pop sites accordingly
+ std::queue<int> m_free_worker_indices;
And replace top()/pop() with front()/pop().
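The bias can be seen in a deliberately extreme toy model, where every dispatched worker finishes at once and rejoins the free pool before the next task is assigned. `simulate_dispatch`, `lifo_counts`, and `fifo_counts` are hypothetical names; real workloads, where several workers are busy at the same time, will show a weaker effect.

```cpp
#include <cassert>
#include <queue>
#include <stack>
#include <vector>

// Assign num_tasks one at a time; the chosen worker is returned to the pool
// immediately. Returns how many tasks each worker received.
template <typename Pool, typename Next>
std::vector<int> simulate_dispatch(Pool pool, Next next, int num_workers, int num_tasks) {
  assert(num_workers > 0);
  for (int w = 0; w < num_workers; ++w) pool.push(w);
  std::vector<int> counts(num_workers, 0);
  for (int t = 0; t < num_tasks; ++t) {
    const int worker = next(pool);
    pool.pop();
    ++counts[worker];
    pool.push(worker);
  }
  return counts;
}

// LIFO: always picks the most recently freed worker.
inline std::vector<int> lifo_counts(int workers, int tasks) {
  return simulate_dispatch(std::stack<int, std::vector<int>>{},
                           [](auto& p) { return p.top(); }, workers, tasks);
}

// FIFO: picks the worker that has been free the longest.
inline std::vector<int> fifo_counts(int workers, int tasks) {
  return simulate_dispatch(std::queue<int>{},
                           [](auto& p) { return p.front(); }, workers, tasks);
}
```

With three workers and six tasks, the LIFO pool gives all six tasks to the last worker, while the FIFO pool gives two to each.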
248-257: Prefer wrapper probe() over raw MPI_Probe for consistency
You use m_communicator.probe() on the worker side but raw MPI_Probe on the manager. Use the wrapper here, too, to keep the surface consistent and to simplify future instrumentation.
- MPI_Status status;
- DYNAMPI_MPI_CHECK(MPI_Probe, (MPI_ANY_SOURCE, MPI_ANY_TAG, m_communicator.get(), &status));
+ MPI_Status status = m_communicator.probe();
258-263: Use idx_for_worker() instead of re-deriving the index math
The manual status.MPI_SOURCE - (status.MPI_SOURCE > m_config.manager_rank) duplicates idx_for_worker(). Centralize to avoid drift.
- int64_t task_idx = m_worker_current_task_indices[status.MPI_SOURCE -
-                                                  (status.MPI_SOURCE > m_config.manager_rank)];
- m_worker_current_task_indices[status.MPI_SOURCE -
-                               (status.MPI_SOURCE > m_config.manager_rank)] = -1;
+ const auto idx = idx_for_worker(status.MPI_SOURCE);
+ int64_t task_idx = m_worker_current_task_indices[idx];
+ m_worker_current_task_indices[idx] = -1;
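The index arithmetic being centralized can be written as a pair of free functions and checked with a round trip. This sketch assumes the mapping simply removes the manager's slot from the rank sequence; in the PR these are member functions that read m_config.manager_rank, and the inverse shown here is an assumption about how worker_for_idx behaves.

```cpp
#include <cassert>

// Worker index = rank with the manager's slot removed from the sequence.
inline int idx_for_worker(int worker_rank, int manager_rank) {
  assert(worker_rank != manager_rank);  // the manager has no worker index
  return worker_rank - (worker_rank > manager_rank ? 1 : 0);
}

// Assumed inverse: indices at or above the manager's rank shift up by one.
inline int worker_for_idx(int idx, int manager_rank) {
  return idx + (idx >= manager_rank ? 1 : 0);
}
```

With the manager on rank 2 of 5, ranks 0, 1, 3, 4 map to indices 0, 1, 2, 3 and back.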
183-195: Finalize contract and reentrancy
finish_remaining_tasks() returns accumulated results without clearing. Tests rely on this, but it’s worth documenting this “cumulative” contract (second call includes previous results) to avoid surprises if future changes move toward “flush semantics.”
test/mpi/test_distributers.cpp (2)
117-143: Conditional skip for hierarchical distributor is reasonable
Skipping Example2 for hierarchical semantics avoids false failures while the batch-based behavior differs. Consider adding a targeted hierarchical-specific test that validates its semantics separately.
185-213: Statistics assertions: bytes_sent unconditional check may be brittle across distributor strategies
For non-Naive distributors, internal topology (e.g., forwarding through intermediates) could affect observed byte counts on the root. You already guard send/recv counts for Naive only; consider guarding bytes_sent similarly or deriving expectations per distributor.
Would you like me to split this test into per-distributor expectations so Hierarchical can assert its own message/byte accounting?
include/dynampi/impl/hierarchical_distributor.hpp (5)
80-89: total_num_children() undercounts by excluding direct children.
Currently, only descendants are counted; direct children are not added. This yields 0 for a node with only leaves beneath it. If you intend a full subtree size, include each direct child in the sum.
 inline int total_num_children(int rank) const {
   int virtual_rank = rank == m_config.manager_rank ? 0 : idx_for_worker(rank) + 1;
   int num_children = 0;
   for (int i = 0; i < m_config.max_workers_per_coordinator; ++i) {
     int child = virtual_rank * m_config.max_workers_per_coordinator + i + 1;
     if (child >= m_communicator.size()) break;  // No more children
-    num_children += total_num_children(worker_for_idx(child - 1));
+    // Count the direct child, plus its descendants
+    num_children += 1 + total_num_children(worker_for_idx(child - 1));
   }
   return num_children;
 }
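The effect of the missing `1 +` can be reproduced in a standalone model of the same implicit k-ary tree, working directly on virtual ranks. `count_descendants`, `k`, and `size` are illustrative names and plain parameters here, not dynampi config fields.

```cpp
#include <cassert>

// Virtual rank v has children v*k + 1 ... v*k + k, clipped to `size` ranks.
// With count_direct_children == false this mirrors the unpatched recursion.
inline int count_descendants(int virtual_rank, int k, int size, bool count_direct_children) {
  assert(k > 0);
  int total = 0;
  for (int i = 0; i < k; ++i) {
    const int child = virtual_rank * k + i + 1;
    if (child >= size) break;  // no more children
    total += (count_direct_children ? 1 : 0) +
             count_descendants(child, k, size, count_direct_children);
  }
  return total;
}
```

In this model the unpatched form returns 0 for every node, since nothing in the recursion ever adds 1. With the direct child counted, the root of a 7-rank binary tree reports 6 descendants and virtual rank 1 reports 2.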
212-220: Method name is misleading; either send an explicit request or rename.
return_results_and_request_next_batch_from_manager() only sends RESULT_BATCH and relies on the parent’s receive_result_batch_from() to implicitly treat results as a request. That coupling is fine, but the name suggests an explicit REQUEST/REQUEST_BATCH is also emitted.
Two options:
- Rename to return_results_to_parent() to reflect actual behavior.
- Or, explicitly request the next batch right after sending results (but then remove the implicit “results-as-request” behavior on the parent to avoid double requests).
If you prefer a rename, minimal change:
- void return_results_and_request_next_batch_from_manager() {
+ void return_results_to_parent() {
   DYNAMPI_ASSERT(!is_leaf_worker(), "Leaf workers should not return results directly");
   ...
 }
And its call site:
- return_results_and_request_next_batch_from_manager();
+ return_results_to_parent();
Would you like me to propagate the rename across this file?
296-318: Avoid overloading m_results_sent_to_parent for manager-only invariants.
Setting m_results_sent_to_parent = m_results.size() on the manager blurs the “sent to parent” semantic to make assertions pass. This is surprising and fragile for future maintainers.
Suggestion:
- Do not mutate m_results_sent_to_parent here (manager has no parent).
- Assert manager-specific invariants directly in terms of m_results.size(), m_results_received_from_child, and m_tasks_received_from_parent.
Example:
- m_results_sent_to_parent = m_results.size();
  DYNAMPI_ASSERT_EQ(m_results_received_from_child, m_tasks_sent_to_child,
                    "All tasks should have been processed by workers before finalizing");
- DYNAMPI_ASSERT_EQ(m_results_sent_to_parent, m_tasks_received_from_parent,
-                   "All results should have been sent to the parent before finalizing");
- if (m_communicator.size() > 1)
-   DYNAMPI_ASSERT_EQ(m_results_sent_to_parent, m_results_received_from_child + m_tasks_executed,
-                     "Manager should not send results to itself");
+ DYNAMPI_ASSERT_EQ(m_results.size(), m_tasks_received_from_parent,
+                   "Manager: aggregated results must match tasks inserted");
+ if (m_communicator.size() > 1)
+   DYNAMPI_ASSERT_EQ(m_results.size(), m_results_received_from_child + m_tasks_executed,
+                     "Manager should not execute child work");
If desired, I can adjust the destructor’s assertions similarly to preserve clear semantics.
456-491: Add a default case in the tag switch to catch unexpected tags.
A defensive default case will make issues more diagnosable if new tags are introduced or mismatches occur.
 switch (status.MPI_TAG) {
   case Tag::TASK: {
     return receive_execute_return_task_from(status);
   }
   case Tag::TASK_BATCH: {
     return receive_task_batch_from(status);
   }
   case Tag::RESULT: {
     return receive_result_from(status);
   }
   case Tag::RESULT_BATCH: {
     return receive_result_batch_from(status);
   }
   case Tag::REQUEST: {
     m_communicator.recv_empty_message(status.MPI_SOURCE, Tag::REQUEST);
     m_free_worker_indices.push(TaskRequest{.worker_rank = status.MPI_SOURCE});
     return;
   }
   case Tag::REQUEST_BATCH: {
     int request_count;
     m_communicator.recv(request_count, status.MPI_SOURCE, Tag::REQUEST_BATCH);
     m_free_worker_indices.push(
         TaskRequest{.worker_rank = status.MPI_SOURCE, .num_tasks_requested = request_count});
     return;
   }
   case Tag::DONE: {
     m_communicator.recv_empty_message(status.MPI_SOURCE, Tag::DONE);
     m_done = true;
     return;
   }
+  default: {
+    DYNAMPI_UNIMPLEMENTED("Unknown MPI tag in hierarchical distributor");
+  }
 }
52-69: Naming nit: m_free_worker_indices stores TaskRequest, not indices.
Consider renaming to m_free_worker_requests or m_ready_workers to reflect content. This improves readability.
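A reduced illustration of the defensive default case follows. The tag set and values are assumptions for the sketch, not dynampi's actual tag numbers, and an exception stands in for DYNAMPI_UNIMPLEMENTED so the failure path can be exercised in a test.

```cpp
#include <stdexcept>
#include <string>

// Illustrative tag set; the values are assumptions, not dynampi's tag numbers.
enum Tag : int { TASK = 0, RESULT = 1, DONE = 2 };

// Dispatch on the raw tag value as it would arrive in MPI_Status::MPI_TAG.
inline std::string handle_tag(int mpi_tag) {
  switch (mpi_tag) {
    case TASK:
      return "task";
    case RESULT:
      return "result";
    case DONE:
      return "done";
    default:
      // Without this branch an unknown tag would fall out of the switch silently.
      throw std::logic_error("unexpected MPI tag " + std::to_string(mpi_tag));
  }
}
```

Because the switch operates on a plain int rather than the enum type, the compiler cannot warn about unhandled values, which is what makes the explicit default worthwhile.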
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (17)
- .github/workflows/sanitizers.yml (1 hunks)
- .gitignore (1 hunks)
- benchmark/CMakeLists.txt (1 hunks)
- benchmark/asymptotic_distribution_throughput.cpp (2 hunks)
- benchmark/pingpong.cpp (1 hunks)
- include/dynampi/impl/hierarchical_distributor.hpp (4 hunks)
- include/dynampi/impl/naive_distributor.hpp (4 hunks)
- include/dynampi/mpi/mpi_communicator.hpp (7 hunks)
- include/dynampi/mpi/mpi_types.hpp (1 hunks)
- include/dynampi/utilities/assert.hpp (1 hunks)
- include/dynampi/utilities/printing.hpp (1 hunks)
- test/CMakeLists.txt (1 hunks)
- test/lsan.supp (1 hunks)
- test/mpi/test_distributers.cpp (8 hunks)
- test/mpi/test_mpi_wrapper.cpp (1 hunks)
- test/unit/test_assert.cpp (1 hunks)
- test/unit/test_printing.cpp (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (9)
test/mpi/test_mpi_wrapper.cpp (2)
include/dynampi/mpi/mpi_communicator.hpp (1)
rank(132-136)test/mpi/mpi_test_environment.hpp (1)
rank(24-28)
include/dynampi/utilities/assert.hpp (2)
include/dynampi/impl/hierarchical_distributor.hpp (14)
DYNAMPI_ASSERT(168-210)DYNAMPI_ASSERT(212-220)DYNAMPI_ASSERT(320-336)DYNAMPI_ASSERT(355-368)rank(70-78)rank(80-89)rank(80-80)rank(91-101)rank(91-91)rank(103-112)rank(114-117)DYNAMPI_ASSERT_GT(456-491)DYNAMPI_ASSERT_EQ(224-228)DYNAMPI_ASSERT_EQ(370-382)test/unit/test_assert.cpp (1)
DYNAMPI_ASSERT_LT(122-122)
benchmark/asymptotic_distribution_throughput.cpp (2)
include/dynampi/impl/hierarchical_distributor.hpp (4)
task(230-237)task(230-231)task(238-245)task(238-239)include/dynampi/impl/naive_distributor.hpp (5)
task(132-138)task(132-133)task(139-145)task(139-140)task(161-181)
benchmark/pingpong.cpp (1)
test/mpi/mpi_test_environment.hpp (2)
MPI_Finalize(16-16)MPI_Init(14-14)
test/unit/test_assert.cpp (2)
include/dynampi/impl/hierarchical_distributor.hpp (13)
rank(70-78)rank(80-89)rank(80-80)rank(91-101)rank(91-91)rank(103-112)rank(114-117)DYNAMPI_ASSERT(168-210)DYNAMPI_ASSERT(212-220)DYNAMPI_ASSERT(320-336)DYNAMPI_ASSERT(355-368)DYNAMPI_ASSERT_EQ(224-228)DYNAMPI_ASSERT_EQ(370-382)include/dynampi/mpi/mpi_communicator.hpp (1)
rank(132-136)
include/dynampi/impl/hierarchical_distributor.hpp (3)
include/dynampi/utilities/template_options.hpp (2)
get_option_value(27-27)value(27-29)include/dynampi/mpi/mpi_communicator.hpp (4)
rank(132-136)statistics_mode(126-130)statistics_mode(126-127)nodiscard(215-215)include/dynampi/impl/naive_distributor.hpp (19)
m_communicator(124-124)comm(70-76)comm(70-70)statistics_mode(94-99)statistics_mode(94-95)task(132-138)task(132-133)task(139-145)task(139-140)task(161-181)tasks(155-159)tasks(155-156)insert_tasks(149-154)insert_tasks(149-149)nodiscard(183-195)worker_rank(237-245)worker_rank(237-237)idx(247-247)idx(247-247)
test/mpi/test_distributers.cpp (2)
include/dynampi/impl/hierarchical_distributor.hpp (4)
task(230-237)task(230-231)task(238-245)task(238-239)include/dynampi/impl/naive_distributor.hpp (5)
task(132-138)task(132-133)task(139-145)task(139-140)task(161-181)
include/dynampi/mpi/mpi_communicator.hpp (2)
include/dynampi/impl/hierarchical_distributor.hpp (17)
comm(140-146)comm(140-140)rank(70-78)rank(80-89)rank(80-80)rank(91-101)rank(91-91)rank(103-112)rank(114-117)status(399-411)status(399-399)status(413-424)status(413-413)status(426-437)status(426-426)status(439-454)status(439-439)include/dynampi/impl/naive_distributor.hpp (8)
comm(70-76)comm(70-70)assert(101-122)assert(126-130)assert(197-203)assert(213-225)assert(227-235)assert(249-276)
include/dynampi/impl/naive_distributor.hpp (3)
include/dynampi/utilities/template_options.hpp (2)
get_option_value(27-27)value(27-29)include/dynampi/impl/hierarchical_distributor.hpp (13)
m_communicator(222-222)m_communicator(263-294)task(230-237)task(230-231)task(238-245)task(238-239)tasks(257-261)tasks(257-258)nodiscard(296-318)worker_rank(384-392)worker_rank(384-384)idx(394-394)idx(394-394)include/dynampi/mpi/mpi_communicator.hpp (1)
nodiscard(215-215)
🪛 GitHub Check: Codacy Static Code Analysis
benchmark/pingpong.cpp
[warning] 55-55: benchmark/pingpong.cpp#L55
Method parse_args has 62 lines of code (limit is 50)
[warning] 55-55: benchmark/pingpong.cpp#L55
Method parse_args has a cyclomatic complexity of 24 (limit is 8)
[warning] 123-123: benchmark/pingpong.cpp#L123
Method ping_once has 81 lines of code (limit is 50)
[warning] 123-123: benchmark/pingpong.cpp#L123
Method ping_once has a cyclomatic complexity of 34 (limit is 8)
[warning] 213-213: benchmark/pingpong.cpp#L213
Method main has 128 lines of code (limit is 50)
[warning] 213-213: benchmark/pingpong.cpp#L213
Method main has a cyclomatic complexity of 25 (limit is 8)
[failure] 364-364: benchmark/pingpong.cpp#L364
Does not handle strings that are not \0-terminated; if given one it may perform an over-read (it could cause a crash if unprotected) (CWE-126).
🪛 GitHub Actions: Linux GCC
benchmark/pingpong.cpp
[error] 80-80: Command failed: cmake --build build --config Release --parallel. Build failed: error: 'm' may be used uninitialized in parse_args() (pingpong.cpp:80). cc1plus: all warnings being treated as errors.
🪛 GitHub Actions: Pre-commit
benchmark/pingpong.cpp
[error] 338-338: BA should be BY, BE
[error] 356-356: ba should be by, be
[error] 360-360: ba should be by, be
[error] 360-360: BA should be BY, BE
[error] 361-361: BA should be BY, BE
[error] 366-366: BA should be BY, BE
[error] 367-367: ba should be by, be
[error] 1-1: REUSE: Missing copyright and licensing information in benchmark/pingpong.cpp.
benchmark/CMakeLists.txt
[error] 1-1: clang-format: formatting changes were applied to benchmark/CMakeLists.txt.
[error] 1-1: trailing-whitespace: trailing whitespace detected and removed in benchmark/CMakeLists.txt.
include/dynampi/impl/hierarchical_distributor.hpp
[error] 1-1: clang-format: formatting changes were made by the pre-commit hook.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: linux-clang (18, Debug)
- GitHub Check: linux-clang (15, Release)
- GitHub Check: linux-clang (18, Release)
- GitHub Check: linux-clang (15, Debug)
- GitHub Check: linux-clang (16, Release)
- GitHub Check: linux-clang (17, Debug)
- GitHub Check: linux-clang (16, Debug)
- GitHub Check: linux-clang (17, Release)
- GitHub Check: linux-clang (14, Release)
- GitHub Check: linux-clang (14, Debug)
- GitHub Check: Windows Intel MPI (MPI 4.0) (Debug)
- GitHub Check: Windows MinGW (Release)
- GitHub Check: Windows Intel MPI (MPI 4.0) (Release)
- GitHub Check: Windows MinGW (Debug)
- GitHub Check: linux-debug-sanitizers (undefined)
- GitHub Check: codecov
- GitHub Check: linux-intel (Release)
- GitHub Check: linux-intel (Debug)
🔇 Additional comments (21)
include/dynampi/mpi/mpi_types.hpp (1)
12-12: Good addition: include belongs here.
This makes the std::string specialization self-sufficient and removes reliance on transitive includes. No concerns.
.gitignore (1)
11-12: LGTM: ignore patterns for crash dumps and CSV artifacts make sense
Ignoring core dumps and benchmark CSV outputs aligns with the new benchmarks and avoids noisy diffs.
benchmark/CMakeLists.txt (2)
12-26: Nice generalization of benchmark targets.
The foreach-based pattern is clean and makes it trivial to add new benchmarks by editing the list once. Linking to dynampi and cxxopts per target is correct.
12-26: Ensure formatting fixes are committed
It looks like the pre-commit CLI isn’t available in your local environment, so the CI-applied formatting changes aren’t visible yet. Please manually apply and commit the formatting fixes to benchmark/CMakeLists.txt:
- Run clang-format against the project style:
  clang-format -i -style=file benchmark/CMakeLists.txt
- Strip any trailing whitespace:
  grep -R -nP "[ \t]+$" benchmark/CMakeLists.txt | cut -d: -f1 | uniq | xargs -I{} sed -i 's/[ \t]*$//' benchmark/CMakeLists.txt
- Stage and commit the resulting diff so CI checks (clang-format and whitespace) pass green.
test/unit/test_printing.cpp (1)
30-98: Solid coverage of printing utilities.
Good spread across byte, span, vector delegation, array, set, pair, optional, nested containers, and tuple. Nicely exercises the operator<< overloads.
include/dynampi/utilities/assert.hpp (1)
75-86: Binary-op assertion helpers: approach looks good.
Single-evaluation via local A/B copies avoids side effects; the negated-op message content is clear and leverages the printing utilities.
Also applies to: 96-114
test/unit/test_assert.cpp (2)
11-18: MPI rank stubbing is well-placed.
Redefining MPI_Comm_rank before including the header-under-test guarantees deterministic rank output for checks.
Also applies to: 20-22
115-123: Release-path “no-op” test verifies macro erasure; keep it.
The intentionally non-C++ tokens in DYNAMPI_ASSERT_LT(...) within the NDEBUG path validate that the macro truly erases arguments. This is a clever guard against accidental evaluation in release builds.
test/mpi/test_mpi_wrapper.cpp (1)
65-66: Good additions: validate detailed averages on send/recv paths
The assertions correctly reflect the Detailed mode semantics: sender’s average_send_size() equals sizeof(int) with zero receives, and vice versa on the receiver. This tightens coverage without altering flow.
Also applies to: 71-72
include/dynampi/impl/naive_distributor.hpp (1)
34-34: Public ordered flag is fine; confirms deterministic result expectation in tests
Exposing static const bool ordered = true; aligns with the new test gates and documents the distributor’s deterministic emission order.
include/dynampi/mpi/mpi_communicator.hpp (2)
157-161: Nice: probe() wrapper improves ergonomics and parity with other ops
The probe() addition reduces boilerplate and harmonizes call sites. Good change.
101-105: Destructor safety on moved-from communicators
The guard m_ownership != Reference prevents freeing a moved-from (null) communicator. This is correct and avoids MPI_Comm_free(MPI_COMM_NULL).
test/mpi/test_distributers.cpp (5)
19-24: Trait utility looks correct and unlocks distributor-specific test paths
is_specialization_of is implemented correctly and keeps type-based branches neat.
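For readers unfamiliar with this kind of trait, the following is a minimal sketch of how a specialization check is commonly written. It is not the project's definition, which may differ (for example, to cope with non-type template parameters), and `NaiveLike` and `HierarchicalLike` are hypothetical stand-ins rather than the real distributor classes.

```cpp
#include <type_traits>
#include <vector>

// Primary template: by default, T is not a specialization of Template.
template <typename T, template <typename...> class Template>
struct is_specialization_of : std::false_type {};

// Partial specialization: matches exactly when T is Template<Args...>.
template <template <typename...> class Template, typename... Args>
struct is_specialization_of<Template<Args...>, Template> : std::true_type {};

template <typename T, template <typename...> class Template>
inline constexpr bool is_specialization_of_v = is_specialization_of<T, Template>::value;

// Hypothetical stand-ins for two distributor class templates.
template <typename TaskT, typename ResultT>
struct NaiveLike {};
template <typename TaskT, typename ResultT>
struct HierarchicalLike {};
```

A test can then branch with `if constexpr (is_specialization_of_v<Distributor, HierarchicalLike>)` instead of relying on runtime type checks.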
39-41: Reordering type list is fine; exercises both distributors consistently
No behavioral risk; increases coverage symmetry.
55-56: Good sanity check: manager rank alignment
This guards against misconfigured manager rank when running tests.
64-66: Determinism guard for unordered distributors
Sorting when !ordered stabilizes assertions. Nice.
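As an illustration of that guard, here is a small sketch that normalizes results only for distributors that make no ordering promise. `OrderedDistributor`, `UnorderedDistributor`, and `normalized_results` are hypothetical names; the only thing taken from the review is the static `ordered` flag.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins exposing the `ordered` flag described in the review.
struct OrderedDistributor {
  static constexpr bool ordered = true;
};
struct UnorderedDistributor {
  static constexpr bool ordered = false;
};

// Returns the results in a form that is safe to compare against an expected
// vector: untouched for ordered distributors, sorted for unordered ones.
template <typename Distributor, typename T>
std::vector<T> normalized_results(std::vector<T> results) {
  if constexpr (!Distributor::ordered) {
    std::sort(results.begin(), results.end());
    assert(std::is_sorted(results.begin(), results.end()));
  }
  return results;
}
```

Sorting only in the unordered case keeps the ordered distributor's test strict: an unexpected reordering there would still fail the comparison.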
149-152: Gate PriorityQueue test on ordering: good guard
This prevents nondeterministic failures on distributors that don’t promise order.
include/dynampi/impl/hierarchical_distributor.hpp (4)
30-42: ordered flag is added; confirm external expectations.
Static const bool ordered = false alters behavior expected by tests/consumers. Ensure any ordering-sensitive tests toggle based on this flag, as was done for the naive distributor.
Do you want me to scan the repo to confirm tests check this flag and don’t assume ordering for the hierarchical distributor?
371-381: Finalization assumes all direct children are “free”; precondition ensured above is good.
The explicit equality check between m_free_worker_indices.size() and num_direct_children() is a good guard before broadcasting DONE. Nice.
148-159: Constructor behavior is clear; auto_run_workers gate looks good.
Duplicating the communicator and the conditional worker auto-run wiring are appropriate. Empty else is fine.
1-495: Align code formatting with pre-commit/clang-format
CI is failing on formatting in include/dynampi/impl/hierarchical_distributor.hpp. Please verify and fix formatting by running your project’s pre-commit hooks or clang-format locally before pushing:
• Install and run pre-commit hooks:
  pre-commit install
  pre-commit run -a
• Or format just this file:
  clang-format -i include/dynampi/impl/hierarchical_distributor.hpp
Re-run CI and commit any resulting changes to resolve the formatting errors.
| @@ -0,0 +1,373 @@ | |||
| // mpi_pair_bench.cpp | |||
Add SPDX license header to satisfy REUSE check
The CI reports a REUSE failure for this file. Mirror headers used elsewhere.
Apply at file top:
+// SPDX-FileCopyrightText: 2025 QDX Technologies. Authored by Ryan Stocks <[email protected]>
+// SPDX-License-Identifier: MIT
// mpi_pair_bench.cpp
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // mpi_pair_bench.cpp | |
| // SPDX-FileCopyrightText: 2025 QDX Technologies. Authored by Ryan Stocks <[email protected]> | |
| // SPDX-License-Identifier: MIT | |
| // mpi_pair_bench.cpp |
🧰 Tools
🪛 GitHub Actions: Pre-commit
[error] 1-1: REUSE: Missing copyright and licensing information in benchmark/pingpong.cpp.
🤖 Prompt for AI Agents
In benchmark/pingpong.cpp around line 1, the file is missing the SPDX license
header required by REUSE; add the same SPDX header block used in other project
source files at the very top of this file (e.g., SPDX-License-Identifier:
Apache-2.0 or the project's chosen identifier and any copyright owner lines) so
the header mirrors the repository's existing license headers and satisfies the
REUSE check.
| else if (a == "--methods") { | ||
| need("--methods"); | ||
| methods_specified = true; | ||
| opt.methods.clear(); | ||
| std::string list = argv[++i]; | ||
| size_t start = 0; | ||
| while (start <= list.size()) { | ||
| size_t comma = list.find(',', start); | ||
| std::string tok = (comma == std::string::npos) ? list.substr(start) : list.substr(start, comma - start); | ||
| Method m; | ||
| if (!parse_method(tok, m)) die(rank, "unknown method in --methods: " + tok); | ||
| opt.methods.push_back(m); | ||
| if (comma == std::string::npos) break; | ||
| start = comma + 1; | ||
| } | ||
| } |
Fix -Wmaybe-uninitialized on local variable ‘m’ in parse_args()
GCC treats m as potentially uninitialized when pushed into opt.methods, tripping -Werror in CI.
Apply this localized initialization:
- Method m;
+ Method m = Method::SEND; // default to silence -Wmaybe-uninitialized
if (!parse_method(tok, m)) die(rank, "unknown method in --methods: " + tok);
opt.methods.push_back(m);
📝 Committable suggestion
| else if (a == "--methods") { | |
| need("--methods"); | |
| methods_specified = true; | |
| opt.methods.clear(); | |
| std::string list = argv[++i]; | |
| size_t start = 0; | |
| while (start <= list.size()) { | |
| size_t comma = list.find(',', start); | |
| std::string tok = (comma == std::string::npos) ? list.substr(start) : list.substr(start, comma - start); | |
| Method m; | |
| if (!parse_method(tok, m)) die(rank, "unknown method in --methods: " + tok); | |
| opt.methods.push_back(m); | |
| if (comma == std::string::npos) break; | |
| start = comma + 1; | |
| } | |
| } | |
| else if (a == "--methods") { | |
| need("--methods"); | |
| methods_specified = true; | |
| opt.methods.clear(); | |
| std::string list = argv[++i]; | |
| size_t start = 0; | |
| while (start <= list.size()) { | |
| size_t comma = list.find(',', start); | |
| std::string tok = (comma == std::string::npos) | |
| ? list.substr(start) | |
| : list.substr(start, comma - start); | |
| Method m = Method::SEND; // default to silence -Wmaybe-uninitialized | |
| if (!parse_method(tok, m)) | |
| die(rank, "unknown method in --methods: " + tok); | |
| opt.methods.push_back(m); | |
| if (comma == std::string::npos) break; | |
| start = comma + 1; | |
| } | |
| } |
🧰 Tools
🪛 GitHub Actions: Linux GCC
[error] 80-80: Command failed: cmake --build build --config Release --parallel. Build failed: error: 'm' may be used uninitialized in parse_args() (pingpong.cpp:80). cc1plus: all warnings being treated as errors.
🤖 Prompt for AI Agents
In benchmark/pingpong.cpp around lines 71 to 86, GCC warns that local variable
'm' may be uninitialized before being pushed into opt.methods; to fix it, change
the declaration to an explicit value-initialized Method (e.g. default-construct
it) so m is always initialized before parse_method() and before
opt.methods.push_back(m), keeping the rest of the parsing loop the same.
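As an alternative to default-initializing the out-parameter, the warning can also be avoided by not having an out-parameter at all. The sketch below is an assumption-laden illustration, not the benchmark's code: the `Method` enumerators, the token spellings, and the `std::optional`-returning signatures are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical enumerators; the benchmark's real Method enum may differ.
enum class Method { Send, Isend, Ssend };

// Returns the parsed method, or std::nullopt for an unknown token. With no
// out-parameter there is no variable the compiler could see as uninitialized.
std::optional<Method> parse_method(const std::string& tok) {
  if (tok == "send") return Method::Send;
  if (tok == "isend") return Method::Isend;
  if (tok == "ssend") return Method::Ssend;
  return std::nullopt;
}

// Splits a comma-separated list and parses every token.
// Returns std::nullopt as soon as one token is not a known method.
std::optional<std::vector<Method>> parse_method_list(const std::string& list) {
  std::vector<Method> methods;
  std::size_t start = 0;
  while (true) {
    assert(start <= list.size());
    const std::size_t comma = list.find(',', start);
    const std::string tok = (comma == std::string::npos)
                                ? list.substr(start)
                                : list.substr(start, comma - start);
    const std::optional<Method> m = parse_method(tok);
    if (!m) return std::nullopt;
    methods.push_back(*m);
    if (comma == std::string::npos) break;
    start = comma + 1;
  }
  return methods;
}
```

The caller decides how to report an unknown token (for example by calling the benchmark's die helper), which keeps the parsing itself free of side effects.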
| const int TAG_BA_RESULT = 88001; | ||
| // Main sweep: pairs × sizes × methods | ||
| for (int a = 0; a < world; ++a) { | ||
| for (int b = a + 1; b < world; ++b) { | ||
| if (!pair_is_enabled(a, b)) continue; | ||
| const bool same_node = (rank_name(a) == rank_name(b)); | ||
| const char *locality = same_node ? "intranode" : "internode"; | ||
| for (std::size_t bytes : sizes) { | ||
| for (Method m : opt.methods) { | ||
| // a->b | ||
| MPI_Barrier(MPI_COMM_WORLD); | ||
| PingResult rtt_ab = ping_once(a, b, me, bytes, opt.warmup, opt.iters, buffer, m); | ||
| // b->a | ||
| MPI_Barrier(MPI_COMM_WORLD); | ||
| PingResult rtt_ba = ping_once(b, a, me, bytes, opt.warmup, opt.iters, buffer, m); | ||
| // Ship b->a sender's measurement to logger (rank a) | ||
| if (me == b) { | ||
| double payload[2] = { rtt_ba.avg_rtt_s, rtt_ba.send_call_total_s }; | ||
| MPI_Send(payload, 2, MPI_DOUBLE, a, TAG_BA_RESULT, MPI_COMM_WORLD); | ||
| } | ||
| if (me == a) { | ||
| double payload[2]; | ||
| MPI_Recv(payload, 2, MPI_DOUBLE, b, TAG_BA_RESULT, MPI_COMM_WORLD, MPI_STATUS_IGNORE); | ||
| PingResult rtt_ba_from_b{ payload[0], payload[1] }; |
Resolve pre-commit/codespell “BA/by/be” false positives by renaming tag
The constant/tag name “BA” appears to trigger codespell. Renaming keeps meaning while placating the hook.
Apply:
- const int TAG_BA_RESULT = 88001;
+ const int TAG_B2A_RESULT = 88001;
@@
- MPI_Send(payload, 2, MPI_DOUBLE, a, TAG_BA_RESULT, MPI_COMM_WORLD);
+ MPI_Send(payload, 2, MPI_DOUBLE, a, TAG_B2A_RESULT, MPI_COMM_WORLD);
@@
- MPI_Recv(payload, 2, MPI_DOUBLE, b, TAG_BA_RESULT, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
+ MPI_Recv(payload, 2, MPI_DOUBLE, b, TAG_B2A_RESULT, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
If the hook is also flagging the literal “b->a” strings in CSV, we can either:
- Keep them (preferred for clarity) and add a codespell ignore list for “b->a, BA, B2A”, or
- Change the CSV direction field to “A_TO_B” / “B_TO_A”.
I can open a small follow-up PR to add a repo-level codespell ignore if you prefer the first option.
📝 Committable suggestion
| const int TAG_BA_RESULT = 88001; | |
| // Main sweep: pairs × sizes × methods | |
| for (int a = 0; a < world; ++a) { | |
| for (int b = a + 1; b < world; ++b) { | |
| if (!pair_is_enabled(a, b)) continue; | |
| const bool same_node = (rank_name(a) == rank_name(b)); | |
| const char *locality = same_node ? "intranode" : "internode"; | |
| for (std::size_t bytes : sizes) { | |
| for (Method m : opt.methods) { | |
| // a->b | |
| MPI_Barrier(MPI_COMM_WORLD); | |
| PingResult rtt_ab = ping_once(a, b, me, bytes, opt.warmup, opt.iters, buffer, m); | |
| // b->a | |
| MPI_Barrier(MPI_COMM_WORLD); | |
| PingResult rtt_ba = ping_once(b, a, me, bytes, opt.warmup, opt.iters, buffer, m); | |
| // Ship b->a sender's measurement to logger (rank a) | |
| if (me == b) { | |
| double payload[2] = { rtt_ba.avg_rtt_s, rtt_ba.send_call_total_s }; | |
| MPI_Send(payload, 2, MPI_DOUBLE, a, TAG_BA_RESULT, MPI_COMM_WORLD); | |
| } | |
| if (me == a) { | |
| double payload[2]; | |
| MPI_Recv(payload, 2, MPI_DOUBLE, b, TAG_BA_RESULT, MPI_COMM_WORLD, MPI_STATUS_IGNORE); | |
| PingResult rtt_ba_from_b{ payload[0], payload[1] }; | |
| // Rename TAG_BA_RESULT to TAG_B2A_RESULT to avoid codespell “BA” false positives | |
| const int TAG_B2A_RESULT = 88001; | |
| // Main sweep: pairs × sizes × methods | |
| for (int a = 0; a < world; ++a) { | |
| for (int b = a + 1; b < world; ++b) { | |
| if (!pair_is_enabled(a, b)) continue; | |
| const bool same_node = (rank_name(a) == rank_name(b)); | |
| const char *locality = same_node ? "intranode" : "internode"; | |
| for (std::size_t bytes : sizes) { | |
| for (Method m : opt.methods) { | |
| // a->b | |
| MPI_Barrier(MPI_COMM_WORLD); | |
| PingResult rtt_ab = ping_once(a, b, me, bytes, opt.warmup, opt.iters, buffer, m); | |
| // b->a | |
| MPI_Barrier(MPI_COMM_WORLD); | |
| PingResult rtt_ba = ping_once(b, a, me, bytes, opt.warmup, opt.iters, buffer, m); | |
| // Ship b->a sender's measurement to logger (rank a) | |
| if (me == b) { | |
| double payload[2] = { rtt_ba.avg_rtt_s, rtt_ba.send_call_total_s }; | |
| MPI_Send(payload, 2, MPI_DOUBLE, a, TAG_B2A_RESULT, MPI_COMM_WORLD); | |
| } | |
| if (me == a) { | |
| double payload[2]; | |
| MPI_Recv(payload, 2, MPI_DOUBLE, b, TAG_B2A_RESULT, MPI_COMM_WORLD, MPI_STATUS_IGNORE); | |
| PingResult rtt_ba_from_b{ payload[0], payload[1] }; |
| void allocate_task_to_child() { | ||
| if (m_communicator.size() > 1) { | ||
| if (m_free_worker_indices.empty()) { | ||
| // If no free workers, wait for a result to be received | ||
| receive_from_any_worker(); | ||
| receive_from_anyone(); | ||
| } | ||
| int worker = _free_worker_indices.top(); | ||
| _free_worker_indices.pop(); | ||
| _worker_current_task_indices[idx_for_worker(worker)] = _tasks_sent; | ||
| if constexpr (statistics_mode >= StatisticsMode::Aggregated) { | ||
| _statistics.worker_task_counts[worker]++; | ||
| TaskRequest request = m_free_worker_indices.top(); | ||
| int worker = request.worker_rank; | ||
| m_free_worker_indices.pop(); | ||
| if (request.num_tasks_requested.has_value()) { | ||
| std::vector<TaskT> tasks; | ||
| tasks.reserve(request.num_tasks_requested.value()); | ||
| for (int i = 0; i < request.num_tasks_requested; ++i) { | ||
| if (m_unallocated_task_queue.empty()) { | ||
| break; // No more tasks to allocate | ||
| } | ||
| tasks.push_back(get_next_task_to_send()); | ||
| } | ||
| m_communicator.send(tasks, worker, Tag::TASK_BATCH); | ||
| m_tasks_sent_to_child += tasks.size(); | ||
| } else { | ||
| const TaskT task = get_next_task_to_send(); | ||
| m_communicator.send(task, worker, Tag::TASK); | ||
| m_tasks_sent_to_child++; | ||
| } |
Use std::optional::value() (or operator*) and harden free-worker wait.
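- Clarity: i < request.num_tasks_requested compares an int with a std::optional<int>. This compiles through std::optional's mixed comparisons with a value, but reading the bound once with value() (or *) states the intent directly.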
- Robustness: if no free workers are available after a single receive_from_anyone(), top() will still underflow. Use a loop to wait.
void allocate_task_to_child() {
if (m_communicator.size() > 1) {
- if (m_free_worker_indices.empty()) {
- // If no free workers, wait for a result to be received
- receive_from_anyone();
- }
+ // If no free workers, block until one becomes available
+ while (m_free_worker_indices.empty()) {
+ receive_from_anyone();
+ }
TaskRequest request = m_free_worker_indices.top();
int worker = request.worker_rank;
m_free_worker_indices.pop();
if (request.num_tasks_requested.has_value()) {
std::vector<TaskT> tasks;
- tasks.reserve(request.num_tasks_requested.value());
- for (int i = 0; i < request.num_tasks_requested; ++i) {
+ const int n = request.num_tasks_requested.value();
+ tasks.reserve(n);
+ for (int i = 0; i < n; ++i) {
if (m_unallocated_task_queue.empty()) {
break; // No more tasks to allocate
}
tasks.push_back(get_next_task_to_send());
}
m_communicator.send(tasks, worker, Tag::TASK_BATCH);
m_tasks_sent_to_child += tasks.size();
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In include/dynampi/impl/hierarchical_distributor.hpp around lines 263-287, the
code erroneously compares the loop index to a std::optional<int> and only calls
receive_from_anyone() once which can still leave m_free_worker_indices empty and
cause top()/pop() underflow; fix by using request.num_tasks_requested.value()
(or *request.num_tasks_requested) for the for-loop bound, and replace the single
receive_from_anyone() call with a loop that waits until m_free_worker_indices is
non-empty (e.g., while(m_free_worker_indices.empty()) receive_from_anyone();)
before accessing top()/pop().
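The two changes can be sketched in isolation, without MPI. Everything here is a simplified assumption: `TaskRequest` is reduced to two fields, tasks are plain ints, the free-worker container is a `std::stack`, and `wait_for_worker` stands in for receive_from_anyone().

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <queue>
#include <stack>
#include <vector>

// Hypothetical, simplified request record.
struct TaskRequest {
  int worker_rank;
  std::optional<int> num_tasks_requested;
};

// Keeps calling wait_for_worker() until a request is available, then pops it.
// Looping (rather than waiting once) guarantees top() never sees an empty stack.
template <typename WaitFn>
TaskRequest next_request(std::stack<TaskRequest>& free_workers, WaitFn&& wait_for_worker) {
  while (free_workers.empty()) {
    wait_for_worker();
  }
  TaskRequest request = free_workers.top();
  free_workers.pop();
  return request;
}

// Takes up to the requested number of tasks from the pending queue.
// The optional is unwrapped once, so the loop bound is a plain int.
std::vector<int> take_batch(std::queue<int>& pending, const TaskRequest& request) {
  assert(request.num_tasks_requested.has_value());
  const int n = *request.num_tasks_requested;
  assert(n >= 0);
  std::vector<int> batch;
  batch.reserve(static_cast<std::size_t>(n));
  for (int i = 0; i < n && !pending.empty(); ++i) {
    batch.push_back(pending.front());
    pending.pop();
  }
  return batch;
}
```

In the real distributor the wait call blocks on an MPI receive, so the loop only spins when a received message was something other than a task request.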
| if constexpr (prioritize_tasks) { | ||
| DYNAMPI_UNIMPLEMENTED("Prioritized hierarchical distribution"); | ||
| } else { | ||
| assert(status.MPI_TAG == Tag::REQUEST && "Unexpected tag received in worker"); | ||
| _communicator.recv_empty_message(status.MPI_SOURCE, Tag::REQUEST); | ||
| using message_type = MPI_Type<std::vector<TaskT>>; | ||
| int count; | ||
| DYNAMPI_MPI_CHECK(MPI_Get_count, (&status, message_type::value, &count)); | ||
| std::vector<TaskT> tasks; | ||
| message_type::resize(tasks, count); | ||
| m_communicator.recv(tasks, parent_rank(), Tag::TASK_BATCH); | ||
| m_tasks_received_from_parent += tasks.size(); |
Receive from the probed source, not parent_rank().
Receiving a TASK_BATCH should use the probed status.MPI_SOURCE. Using parent_rank() is brittle and can misroute if assumptions change; it also deviates from the pattern used elsewhere here.
- m_communicator.recv(tasks, parent_rank(), Tag::TASK_BATCH);
+ m_communicator.recv(tasks, status.MPI_SOURCE, Tag::TASK_BATCH);
📝 Committable suggestion
| if constexpr (prioritize_tasks) { | |
| DYNAMPI_UNIMPLEMENTED("Prioritized hierarchical distribution"); | |
| } else { | |
| assert(status.MPI_TAG == Tag::REQUEST && "Unexpected tag received in worker"); | |
| _communicator.recv_empty_message(status.MPI_SOURCE, Tag::REQUEST); | |
| using message_type = MPI_Type<std::vector<TaskT>>; | |
| int count; | |
| DYNAMPI_MPI_CHECK(MPI_Get_count, (&status, message_type::value, &count)); | |
| std::vector<TaskT> tasks; | |
| message_type::resize(tasks, count); | |
| m_communicator.recv(tasks, parent_rank(), Tag::TASK_BATCH); | |
| m_tasks_received_from_parent += tasks.size(); | |
| if constexpr (prioritize_tasks) { | |
| DYNAMPI_UNIMPLEMENTED("Prioritized hierarchical distribution"); | |
| } else { | |
| using message_type = MPI_Type<std::vector<TaskT>>; | |
| int count; | |
| DYNAMPI_MPI_CHECK(MPI_Get_count, (&status, message_type::value, &count)); | |
| std::vector<TaskT> tasks; | |
| message_type::resize(tasks, count); | |
| m_communicator.recv(tasks, status.MPI_SOURCE, Tag::TASK_BATCH); | |
| m_tasks_received_from_parent += tasks.size(); |
🤖 Prompt for AI Agents
In include/dynampi/impl/hierarchical_distributor.hpp around lines 440 to 449,
the recv call uses parent_rank() but should receive from the probed source
stored in status; replace parent_rank() with the probed source (e.g.,
status.MPI_SOURCE or the equivalent accessor used elsewhere) so the TASK_BATCH
is received from status.MPI_SOURCE, keeping the rest of the logic (count,
resize, m_communicator.recv, and m_tasks_received_from_parent update) unchanged.
| template <typename T> | ||
| inline void gather(const T& data, std::vector<T>* result, int root = 0) { | ||
| DYNAMPI_ASSERT_EQ(result != nullptr, root == rank(), | ||
| "Gather result must be provided only on the root rank"); | ||
| using mpi_type = MPI_Type<T>; | ||
| DYNAMPI_MPI_CHECK(MPI_Gather, (mpi_type::ptr(data), mpi_type::count(data), mpi_type::value, | ||
| result == nullptr ? nullptr : result->data(), | ||
| mpi_type::count(data), mpi_type::value, root, m_comm)); | ||
| if constexpr (statistics_mode != StatisticsMode::None) { | ||
| _statistics.collective_count++; | ||
| } | ||
| } |
Bug: gather does not size the receive buffer; can write past end of result
On the root, result->data() is passed to MPI_Gather without ensuring capacity. The receive buffer must be pre-sized to size() * mpi_type::count(data) elements of T. This is an out-of-bounds write and may cause memory corruption or flaky tests.
Apply this diff:
template <typename T>
inline void gather(const T& data, std::vector<T>* result, int root = 0) {
DYNAMPI_ASSERT_EQ(result != nullptr, root == rank(),
"Gather result must be provided only on the root rank");
using mpi_type = MPI_Type<T>;
+ // Pre-size the receive buffer on the root
+ if (rank() == root) {
+ const auto per_rank = static_cast<size_t>(mpi_type::count(data));
+ const auto total = per_rank * static_cast<size_t>(size());
+ result->resize(total);
+ }
DYNAMPI_MPI_CHECK(MPI_Gather, (mpi_type::ptr(data), mpi_type::count(data), mpi_type::value,
- result == nullptr ? nullptr : result->data(),
+ result == nullptr ? nullptr : result->data(),
mpi_type::count(data), mpi_type::value, root, m_comm));
if constexpr (statistics_mode != StatisticsMode::None) {
_statistics.collective_count++;
}
}
Follow-up: Consider documenting that this is MPI_Gather (fixed count per rank). For variable-size payloads, add a dedicated gatherv helper.
📝 Committable suggestion
| template <typename T> | |
| inline void gather(const T& data, std::vector<T>* result, int root = 0) { | |
| DYNAMPI_ASSERT_EQ(result != nullptr, root == rank(), | |
| "Gather result must be provided only on the root rank"); | |
| using mpi_type = MPI_Type<T>; | |
| DYNAMPI_MPI_CHECK(MPI_Gather, (mpi_type::ptr(data), mpi_type::count(data), mpi_type::value, | |
| result == nullptr ? nullptr : result->data(), | |
| mpi_type::count(data), mpi_type::value, root, m_comm)); | |
| if constexpr (statistics_mode != StatisticsMode::None) { | |
| _statistics.collective_count++; | |
| } | |
| } | |
| template <typename T> | |
| inline void gather(const T& data, std::vector<T>* result, int root = 0) { | |
| DYNAMPI_ASSERT_EQ(result != nullptr, root == rank(), | |
| "Gather result must be provided only on the root rank"); | |
| using mpi_type = MPI_Type<T>; | |
| // Pre-size the receive buffer on the root | |
| if (rank() == root) { | |
| const auto per_rank = static_cast<size_t>(mpi_type::count(data)); | |
| const auto total = per_rank * static_cast<size_t>(size()); | |
| result->resize(total); | |
| } | |
| DYNAMPI_MPI_CHECK(MPI_Gather, (mpi_type::ptr(data), mpi_type::count(data), mpi_type::value, | |
| result == nullptr ? nullptr : result->data(), | |
| mpi_type::count(data), mpi_type::value, root, m_comm)); | |
| if constexpr (statistics_mode != StatisticsMode::None) { | |
| _statistics.collective_count++; | |
| } | |
| } |
🤖 Prompt for AI Agents
In include/dynampi/mpi/mpi_communicator.hpp around lines 202–213, the root rank
passes result->data() to MPI_Gather without pre-sizing result, which can cause
out-of-bounds writes; before calling MPI_Gather, if root == rank() resize result
to comm_size * mpi_type::count(data) (where comm_size is the communicator size)
so the receive buffer can hold recvcount-per-rank * num_ranks elements, and only
pass result->data() when result is non-null; retain the current MPI_Gather call
arguments and consider adding a comment that this is fixed-count MPI_Gather (and
that variable sizes require MPI_Gatherv or a helper).
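The sizing rule itself can be shown without MPI. The function below is a serial stand-in, not the wrapper's implementation: `gather_fixed_count` is a hypothetical name, each inner vector plays the role of one rank's send buffer, and the copy loop stands in for what MPI_Gather does on the root.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Serial stand-in for a fixed-count gather on the root rank.
// contributions[r] is what "rank" r would send; every rank sends `count` items.
std::vector<int> gather_fixed_count(const std::vector<std::vector<int>>& contributions,
                                    std::size_t count) {
  std::vector<int> recv;
  // Pre-size the receive buffer: count elements per rank times number of ranks.
  // Writing through recv.data() before this resize would be out of bounds.
  recv.resize(count * contributions.size());
  for (std::size_t r = 0; r < contributions.size(); ++r) {
    assert(contributions[r].size() == count);  // fixed count per rank
    std::copy(contributions[r].begin(), contributions[r].end(),
              recv.begin() + static_cast<std::ptrdiff_t>(r * count));
  }
  return recv;
}
```

Rank r's data lands at offset r * count, which is why the buffer must hold count * size elements before the collective runs.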
| #if __has_include(<source_location>) | ||
| #include <source_location> | ||
| #if defined(__cpp_lib_source_location) | ||
| #define DYNAMPI_HAS_SOURCE_LOCATION | ||
| #endif | ||
| #elif __has_include(<experimental/source_location>) | ||
| #include <experimental/source_location> | ||
| namespace std { | ||
| using source_location = std::experimental::source_location; | ||
| } | ||
| #define DYNAMPI_HAS_SOURCE_LOCATION | ||
| #endif | ||
🛠️ Refactor suggestion
Missing <string_view> include and brittle filename extraction.
- std::string_view is used but <string_view> is not included — this can fail on conforming standard libraries.
- The filename derivation relies on '/' and wraparound on npos; handle both '/' and '\' explicitly and avoid overflow.
#ifndef NDEBUG
#include <iostream>
#include <optional>
+#include <string_view>
@@
inline void _DYNAMPI_FAIL_ASSERT(const std::string &condition_str,
const std::optional<std::string> &message
#ifdef DYNAMPI_HAS_SOURCE_LOCATION
,
const std::source_location &loc = std::source_location::current()
#endif
) {
if (!std::uncaught_exceptions()) {
std::stringstream ss;
#ifdef DYNAMPI_HAS_SOURCE_LOCATION
- std::string_view s = loc.file_name();
- std::string_view filename = s.substr(s.find_last_of('/') + 1);
+ std::string_view filename = loc.file_name();
+ const std::size_t pos = filename.find_last_of("/\\");
+ if (pos != std::string_view::npos) {
+ filename.remove_prefix(pos + 1);
+ }
#endif
int rank = 0;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
ss << "DynaMPI assertion failed on rank " << rank << ": " << condition_str
<< (message ? " " + *message : "")
#ifdef DYNAMPI_HAS_SOURCE_LOCATION
<< "\n in " << loc.function_name() << " at " << filename << ":" << loc.line()
#endif
<< std::endl;
Also applies to: 34-38, 60-81
🤖 Prompt for AI Agents
In include/dynampi/utilities/assert.hpp around lines 21-33 (and similarly at
lines 34-38 and 60-81), add a #include <string_view> so std::string_view is
available, and replace the brittle filename extraction logic with a safe
approach: use find_last_of("/\\") to locate the last path separator, check for
npos, and then call substr(pos+1) (or use pointer/offset logic that avoids
unsigned wraparound) to obtain the basename; ensure the code handles both '/'
and '\\' and does not rely on wraparound behavior when no separator is found.
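The suggested extraction can be tried on its own. This is a minimal sketch assuming C++17; `basename_of` is a hypothetical helper name, not something defined in assert.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns the part of `path` after the last '/' or '\\'.
// If there is no separator, the whole path is returned unchanged.
constexpr std::string_view basename_of(std::string_view path) {
  const std::size_t pos = path.find_last_of("/\\");
  if (pos != std::string_view::npos) {
    path.remove_prefix(pos + 1);
  }
  return path;
}
```

Checking for npos explicitly avoids depending on npos + 1 wrapping to zero, and listing both separators covers Windows-style paths reported by the compiler.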
| #include <array> | ||
| #include <cstdint> | ||
| #include <iostream> | ||
| #include <optional> | ||
| #include <set> | ||
| #include <span> | ||
| #include <tuple> | ||
| #include <vector> | ||
Add <cstddef> and qualify size_t as std::size_t for portability.
This header uses size_t but doesn’t include <cstddef> and doesn’t qualify it. Depending on include order, this can fail to compile.
#include <array>
+#include <cstddef>
#include <cstdint>
#include <iostream>
#include <optional>
#include <set>
#include <span>
#include <tuple>
#include <vector>
-template <typename T, size_t N>
+template <typename T, std::size_t N>
inline std::ostream& operator<<(std::ostream& os, const std::array<T, N>& arr);
template <typename T>
inline std::ostream& operator<<(std::ostream& os, const std::span<T>& vec);
inline std::ostream& operator<<(std::ostream& os, const std::span<T>& vec) {
os << "[";
- for (size_t i = 0; i < vec.size(); i++) {
+ for (std::size_t i = 0; i < vec.size(); ++i) {
os << vec[i];
if (i < vec.size() - 1) {
os << ", ";
}
}
return os << "]";
}
-template <typename T, size_t N>
+template <typename T, std::size_t N>
inline std::ostream& operator<<(std::ostream& os, const std::array<T, N>& arr) {
os << "[";
- for (size_t i = 0; i < arr.size(); i++) {
+ for (std::size_t i = 0; i < arr.size(); ++i) {
os << arr[i];
if (i < arr.size() - 1) {
os << ", ";
}
}
return os << "]";
}
Also applies to: 23-26, 41-51, 58-68
🤖 Prompt for AI Agents
In include/dynampi/utilities/printing.hpp around lines 8-16 (and also apply same
changes at 23-26, 41-51, 58-68): the header uses size_t without including
<cstddef> and without qualifying it, which can break depending on include order;
add #include <cstddef> to the top of the file and change all unqualified
occurrences of size_t to std::size_t in the listed ranges (and any other uses in
this header) so the type is always available and correctly namespaced.
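As a minimal, self-contained sketch of the change requested above, the following shows a `std::array` stream operator written with `<cstddef>` included and `std::size_t` qualified. It is not the PR's header: the `format_array` helper is hypothetical and exists only to make the output easy to inspect, and the separator is written before each element after the first, which is a stylistic variation on the `i < size() - 1` check in the diff.

```cpp
#include <array>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Stream operator using the qualified std::size_t, so it does not depend on
// another header having pulled an unqualified size_t into scope.
template <typename T, std::size_t N>
std::ostream& operator<<(std::ostream& os, const std::array<T, N>& arr) {
  os << "[";
  for (std::size_t i = 0; i < arr.size(); ++i) {
    if (i != 0) {
      os << ", ";  // separator goes before every element except the first
    }
    os << arr[i];
  }
  return os << "]";
}

// Hypothetical helper for illustration only: renders an array to a string.
template <typename T, std::size_t N>
std::string format_array(const std::array<T, N>& arr) {
  std::ostringstream ss;
  ss << arr;
  return ss.str();
}
```

For example, `format_array(std::array<int, 3>{1, 2, 3})` yields `[1, 2, 3]`, and a zero-length array yields `[]`.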
if constexpr (!DistributerWrapper::template type<int, int>::ordered) {
std::sort(result->begin(), result->end());
}
EXPECT_EQ(result, std::vector<char>({'H', 'i'}));
} else {
Incorrect comparison: comparing optional to vector instead of its value
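result is std::optional<std::vector<char>>. EXPECT_EQ(result, std::vector<char>{...}) compares the optional itself with a vector. Since C++17, std::optional provides comparison operators against a plain value, so this can compile, but it hides whether result actually holds a value, and the preceding std::sort(result->begin(), result->end()) already dereferences it unchecked. Compare the contained vector.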
Apply this diff:
- if constexpr (!DistributerWrapper::template type<int, int>::ordered) {
- std::sort(result->begin(), result->end());
- }
- EXPECT_EQ(result, std::vector<char>({'H', 'i'}));
+ if constexpr (!DistributerWrapper::template type<int, int>::ordered) {
+ std::sort(result->begin(), result->end());
+ }
+ EXPECT_EQ(*result, std::vector<char>({'H', 'i'}));
📝 Committable suggestion
if constexpr (!DistributerWrapper::template type<int, int>::ordered) {
std::sort(result->begin(), result->end());
}
EXPECT_EQ(*result, std::vector<char>({'H', 'i'}));
} else {
🤖 Prompt for AI Agents
In test/mpi/test_distributers.cpp around lines 85-89, the test compares the
std::optional<std::vector<char>> variable `result` directly to a
std::vector<char>, which is incorrect and may not compile; change the assertion
to first ensure the optional has a value (ASSERT_TRUE(result.has_value()) or
ASSERT_TRUE(result)) and then compare the contained vector
(EXPECT_EQ(result.value(), std::vector<char>{'H','i'}) or EXPECT_EQ(*result,
std::vector<char>{'H','i'})).
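The check-then-dereference order recommended above can be sketched without GoogleTest as a plain predicate. The function name `holds_expected` and its `ordered` flag are hypothetical stand-ins for the test's `ordered` trait; in the real test the two steps would be `ASSERT_TRUE(result.has_value())` followed by `EXPECT_EQ(*result, ...)`.

```cpp
#include <algorithm>
#include <optional>
#include <vector>

// Hypothetical predicate (not part of the PR): returns true only when `result`
// holds a value that matches `expected`. When `ordered` is false, both sides
// are sorted first so that element order is ignored.
inline bool holds_expected(std::optional<std::vector<char>> result,
                           std::vector<char> expected, bool ordered) {
  if (!result.has_value()) {
    return false;  // check before any dereference; an empty optional never matches
  }
  if (!ordered) {
    std::sort(result->begin(), result->end());
    std::sort(expected.begin(), expected.end());
  }
  return *result == expected;  // compare the contained vector explicitly
}
```

Taking both arguments by value keeps the caller's data untouched while sorting, at the cost of a copy, which seems acceptable for small test vectors.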
Summary by CodeRabbit
New Features
Refactor
Tests
Chores