Race batch PDLP and dual simplex in strong branching / reliability branching #994

Open
Kh4ster wants to merge 74 commits into release/26.04 from race_strong_branching_realibility_branching

Conversation

Contributor

@Kh4ster Kh4ster commented Mar 25, 2026

This PR makes the following changes to batch PDLP:

  • Enable batch PDLP in reliability branching
  • Add work stealing so batch PDLP and Dual Simplex can run concurrently, each stealing LPs from the other when one solves an LP first
  • Use the correct problem representation, including cuts, for batch PDLP
  • Use a PDLP warm-start cache across strong branching at the root and in reliability branching
  • Increase the batch PDLP tolerance to obtain higher-quality solutions
  • Increase the iteration limit so instances that need a high iteration count (with low cost per iteration) can still come through (only while solving the original LP to obtain warm-start data)
  • Add heuristics that skip batch PDLP, avoiding overhead when Dual Simplex is clearly superior
  • Don't store and copy the primal/dual solution unless it is needed, to save memory
  • Handle batch PDLP errors better, so Dual Simplex can continue strong branching even if BPDLP fails
  • No early exit in BPDLP when the initial PDLP warm-start solution is already feasible
  • Correct the BPDLP objective when there is an offset

Currently, BPDLP remains off by default, both at the root and in reliability branching.
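The work-stealing item above can be sketched with a shared pool of atomic flags: each solver claims an LP via compare-and-swap, so whichever one reaches it first wins and the other skips it. This is a hedged standalone illustration, not the cuOpt implementation; `shared_sb_context` and `run_race` are names invented for the example.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Each flag starts at 0; a solver "claims" LP i by atomically flipping it to
// 1, so an LP is solved by exactly one of the two racing solvers.
struct shared_sb_context {
  std::vector<std::atomic<int>> solved;
  explicit shared_sb_context(std::size_t n) : solved(n) {}
  bool try_claim(std::size_t i)
  {
    int expected = 0;
    return solved[i].compare_exchange_strong(expected, 1);
  }
};

// Run two workers (stand-ins for batch PDLP and Dual Simplex) over the same
// pool; owner[i] records which worker claimed LP i. Returns how many LPs
// were claimed in total.
int run_race(std::size_t n_lps, std::vector<int>& owner)
{
  shared_sb_context ctx(n_lps);
  auto worker = [&](int id) {
    for (std::size_t i = 0; i < n_lps; ++i) {
      if (ctx.try_claim(i)) owner[i] = id;  // "solve" LP i here
    }
  };
  std::thread pdlp(worker, 1);
  std::thread simplex(worker, 2);
  pdlp.join();
  simplex.join();
  int claimed = 0;
  for (int o : owner) claimed += (o != 0);
  return claimed;
}
```

Because the claim is a single compare-exchange, no LP is ever solved twice even when both workers scan the pool concurrently.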

Kh4ster and others added 30 commits February 9, 2026 17:47
…te_lp_problem' into race_strong_branching_realibility_branching
…ng, correctly fill the ds_obj objective before merging results at the root, correctly clamp the PDLP objective, remove the unnecessary cuopt_assert regarding fixed point error
Contributor Author

Kh4ster commented Apr 1, 2026

/ok to test de8fac6

Contributor

@chris-maes chris-maes left a comment


LGTM. I mostly checked the interface code; I did not review the deep changes made to support reliability branching in batch PDLP.

Contributor Author

Kh4ster commented Apr 2, 2026

/ok to test 8f0e2f6

Contributor Author

Kh4ster commented Apr 2, 2026

Still waiting for one last thing before merging

Contributor Author

Kh4ster commented Apr 2, 2026

/ok to test 329c255

Contributor Author

Kh4ster commented Apr 2, 2026

/ok to test aecae93

Contributor Author

Kh4ster commented Apr 3, 2026

/ok to test 0d5ae1c

Contributor Author

Kh4ster commented Apr 3, 2026

@coderabbitai review


coderabbitai bot commented Apr 3, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor Author

Kh4ster commented Apr 3, 2026

/ok to test 7504340

Contributor Author

Kh4ster commented Apr 3, 2026

/ok to test a7f79fa

Contributor Author

Kh4ster commented Apr 3, 2026

/ok to test dd07576

Contributor Author

Kh4ster commented Apr 3, 2026

@coderabbitai full review


coderabbitai bot commented Apr 3, 2026

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cpp/src/branch_and_bound/pseudo_costs.cpp (1)

547-584: ⚠️ Potential issue | 🟠 Major

Assert or remap the slack-column suffix invariant.

remove_columns(cols_to_remove) treats new_slacks as arbitrary column indices, but Lines 560-584 still feed the first n entries of root_soln, objective, lower, and upper into PDLP. That only stays aligned if every slack index already lives in [n, lp.num_cols). Otherwise the compacted matrix, the vectors, and the branching indices no longer describe the same columns.

Minimal guard if appended slacks are the intended invariant
   cuopt::mps_parser::mps_data_model_t<i_t, f_t> mps_model;
   int m = lp.num_rows;
   int n = lp.num_cols - new_slacks.size();
   original_root_soln_x.resize(n);
+  for (i_t slack : new_slacks) {
+    assert(slack >= n &&
+           "simplex_problem_to_mps_data_model assumes slack columns are appended");
+  }

If that invariant is not guaranteed, this helper needs a real old→new column map and the callers need to remap the candidate indices as well.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 547 - 584, The code
removes columns via A_no_slacks.remove_columns(cols_to_remove) but still blindly
slices the first n entries of root_soln, lp.objective, lp.lower, lp.upper into
PDLP (original_root_soln_x and mps_model.set_* calls), which is only correct if
all slack indices in new_slacks are appended (>= n); add an explicit invariant
check or build a proper old→new column map: either assert all new_slacks[j] >= n
before proceeding (and document the invariant), or construct a mapping from
original column index → compacted index after remove_columns and use it to remap
original_root_soln_x, the objective vector, lower/upper bound arrays, and any
branching candidate indices so the CSR matrix and these vectors stay aligned
(refer to cols_to_remove, A_no_slacks, original_root_soln_x,
mps_model.set_objective_coefficients, set_variable_lower_bounds,
set_variable_upper_bounds).
cpp/src/pdlp/pdlp.cu (1)

801-864: ⚠️ Potential issue | 🟠 Major

Mark the last resized-batch climbers as solved before this return.

Unlike Line 865, this early-return branch never calls sb_view_.mark_solved(...) for the climbers still left in the batch. Any Dual Simplex task that has not started those branches yet will keep solving them even though PDLP already finished them, so the final chunk cannot be stolen/skipped.

Suggested fix
     if (original_batch_size != climber_strategies_.size()) {
  #ifdef BATCH_VERBOSE_MODE
       std::cout << "Original batch size was " << original_batch_size << " but is now "
                 << climber_strategies_.size() << std::endl;
  #endif
@@
         if (current_termination_strategy_.get_termination_status(i) !=
             pdlp_termination_status_t::ConcurrentLimit) {
           batch_solution_to_return_
             .get_additional_termination_informations()[climber_strategies_[i].original_index]
             .solved_by = method_t::PDLP;
         }
       }
+      if (sb_view_.is_valid()) {
+        for (size_t i = 0; i < climber_strategies_.size(); ++i) {
+          sb_view_.mark_solved(climber_strategies_[i].original_index);
+        }
+      }
       current_termination_strategy_.fill_gpu_terms_stats(total_pdlp_iterations_);
       RAFT_CUDA_TRY(cudaStreamSynchronize(stream_view_));
       current_termination_strategy_.convert_gpu_terms_stats_to_host(
         batch_solution_to_return_.get_additional_termination_informations());
       return optimization_problem_solution_t<i_t, f_t>{
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/pdlp.cu` around lines 801 - 864, This early-return path fails to
notify the solver-bookkeeping that the remaining resized-batch climbers are
finished; before returning, iterate over
current_termination_strategy_.get_terminations_status().size() and call
sb_view_.mark_solved(...) for each climber using
climber_strategies_[i].original_index (and the appropriate termination/solution
metadata you have just copied into batch_solution_to_return_), mirroring the
behavior used at the later return (line 865) so in-progress Dual Simplex tasks
can stop/skip those chunks.
♻️ Duplicate comments (6)
cpp/src/branch_and_bound/pseudo_costs.hpp (1)

416-425: ⚠️ Potential issue | 🟠 Major

Invalidate the PDLP warm-start cache when the LP context changes.

Once populated flips true, this struct has no reset path, so reusing the same pseudo_costs_t after cuts/slacks/original-LP changes can replay stale warm-start vectors and scalars into a different batch LP. Add a reset() helper here and call it from the context-reset path.

Suggested fix
 struct batch_pdlp_warm_cache_t {
   const raft::handle_t batch_pdlp_handle{};
   rmm::device_uvector<f_t> initial_primal{0, batch_pdlp_handle.get_stream()};
   rmm::device_uvector<f_t> initial_dual{0, batch_pdlp_handle.get_stream()};
   f_t step_size{std::numeric_limits<f_t>::signaling_NaN()};
   f_t primal_weight{std::numeric_limits<f_t>::signaling_NaN()};
   i_t pdlp_iteration{-1};
   f_t percent_solved_by_batch_pdlp_at_root{f_t(0.0)};
   bool populated{false};
+
+  void reset()
+  {
+    initial_primal.resize(0, batch_pdlp_handle.get_stream());
+    initial_dual.resize(0, batch_pdlp_handle.get_stream());
+    step_size                           = std::numeric_limits<f_t>::signaling_NaN();
+    primal_weight                       = std::numeric_limits<f_t>::signaling_NaN();
+    pdlp_iteration                      = -1;
+    percent_solved_by_batch_pdlp_at_root = f_t(0.0);
+    populated                           = false;
+  }
 };

As per coding guidelines "Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.hpp` around lines 416 - 425, The
batch_pdlp_warm_cache_t struct can retain stale warm-start data after the LP
context changes because there is no reset path; add a member method reset() on
batch_pdlp_warm_cache_t that clears initial_primal and initial_dual
(release/resize to 0 on batch_pdlp_handle.get_stream()), sets step_size and
primal_weight back to NaN, pdlp_iteration to -1,
percent_solved_by_batch_pdlp_at_root to f_t(0.0) and populated to false, and
then call this reset() from the pseudo_costs_t code path that
reinitializes/changes the LP context (where the context-reset currently occurs)
so any time cuts/slacks/original-LP are changed the warm cache is invalidated.
cpp/src/pdlp/solve.cu (2)

1049-1057: ⚠️ Potential issue | 🟠 Major

Validate shared_sb_solved before taking subspans.

The only guard here is !empty(). If the caller passes fewer solved flags than new_bounds.size(), subspan(i, current_batch_size) walks past the end of the coordination array and corrupts cooperative termination state. Check the span length once before the loop.

Suggested fix
+  cuopt_assert(settings.shared_sb_solved.empty() ||
+                 settings.shared_sb_solved.size() >= max_batch_size,
+               "shared_sb_solved must cover every batch entry");
   for (size_t i = 0; i < max_batch_size; i += optimal_batch_size) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/solve.cu` around lines 1049 - 1057, The code only checks
settings.shared_sb_solved.empty() but not its length relative to new_bounds, so
subspan(i, current_batch_size) can out-of-bounds; before the batching loop
validate that when settings.shared_sb_solved is non-empty its size() (or
extent()) is >= max_batch_size (or original_new_bounds.size()), and if not,
either trim max_batch_size/current_batch_size accordingly or return/error; then
inside the loop you can safely call settings.shared_sb_solved.subspan(i,
current_batch_size) to populate batch_settings.shared_sb_solved. Ensure you
reference settings.shared_sb_solved, original_new_bounds/new_bounds,
max_batch_size and subspan in the check and handle the shorter span case
consistently.

968-972: ⚠️ Potential issue | 🟠 Major

Don't let sub_batch_size bypass the memory ceiling.

memory_max_batch_size is computed specifically to avoid OOM, but a positive settings.sub_batch_size overwrites it unchecked. A caller can request a sub-batch larger than the memory-checked limit and reintroduce the allocation failure this code just avoided.

Suggested fix
   size_t optimal_batch_size = use_optimal_batch_size
                                 ? detail::optimal_batch_size_handler(problem, memory_max_batch_size)
                                 : max_batch_size;
-  if (settings.sub_batch_size > 0) { optimal_batch_size = settings.sub_batch_size; }
+  if (settings.sub_batch_size > 0) {
+    const auto requested_sub_batch_size = static_cast<size_t>(settings.sub_batch_size);
+    cuopt_assert(requested_sub_batch_size <= memory_max_batch_size,
+                 "sub_batch_size exceeds the memory-checked batch size");
+    optimal_batch_size = requested_sub_batch_size;
+  }

As per coding guidelines "Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/solve.cu` around lines 968 - 972, The current logic lets
settings.sub_batch_size override the memory-checked limit, which can reintroduce
OOMs; instead, when computing optimal_batch_size in solve.cu (symbols:
optimal_batch_size, settings.sub_batch_size, memory_max_batch_size,
detail::optimal_batch_size_handler, max_batch_size, cuopt_assert), clamp
settings.sub_batch_size to the memory ceiling (e.g., use
min(settings.sub_batch_size, memory_max_batch_size) or validate and reject
values > memory_max_batch_size) so the final optimal_batch_size never exceeds
memory_max_batch_size (and still respects max_batch_size), then keep the
cuopt_assert as a final sanity check.
cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp (1)

303-305: ⚠️ Potential issue | 🔴 Critical

Keep batch solution collection opt-out, not opt-in.

With this default at false, generic batch LP solves now return empty primal/dual/reduced-cost buffers unless every caller remembers to opt in. Existing batch paths in cpp/tests/linear_programming/pdlp_test.cu still slice those buffers unconditionally, so this regresses the non-MIP batch API.

Suggested fix
-  // By default to save memory and speed we don't store and copy each climber's primal and dual
-  // solutions We only retrieve termination statistics and the objective values
-  bool generate_batch_primal_dual_solution{false};
+  // Preserve the historical batch behavior and collect per-climber solutions unless callers opt out.
+  bool generate_batch_primal_dual_solution{true};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp` around lines
303 - 305, Defaulting generate_batch_primal_dual_solution to false makes batch
solves opt‑in and breaks callers that expect populated primal/dual/reduced-cost
buffers; change the bool generate_batch_primal_dual_solution default to true (so
batch solution collection is enabled by default/opt‑out) and update the adjacent
comment to reflect "opt-out" semantics so existing tests and code that slice
these buffers continue to work without modification.
cpp/src/branch_and_bound/shared_strong_branching_context.hpp (1)

38-56: ⚠️ Potential issue | 🟠 Major

Make the view helpers release-safe and overflow-safe.

These accessors still rely on assert, so a bad index becomes unchecked UB in release builds. subview() also computes offset + count in i_t before the cast, which can overflow; validate the casted sizes separately and use those converted indices for operator[]/subspan.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp` around lines 38
- 56, The accessors (is_solved, mark_solved, subview) use asserts and can
overflow in release builds; replace the assert-based checks with runtime bounds
validation and throw std::out_of_range on failure, and avoid signed arithmetic
overflow in subview by converting indices to size_t first. Concretely: in
is_solved(i_t local_idx) and mark_solved(i_t local_idx) cast local_idx to size_t
(e.g. size_t idx = static_cast<size_t>(local_idx)), verify local_idx >= 0 and
idx < solved.size(), and throw std::out_of_range with a clear message instead of
asserting; in subview(i_t offset, i_t count) cast both to size_t (size_t off =
static_cast<size_t>(offset), cnt = static_cast<size_t>(count)), ensure offset
and count are non-negative, check off <= solved.size() and cnt <= solved.size()
- off, then return {solved.subspan(off, cnt)} to avoid offset+count overflow.
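The hardened accessors described above can be sketched as follows. This is an illustrative standalone version (a raw pointer plus length standing in for the span member; `sb_view_t` is an invented name, not the cuOpt type): bounds checks throw `std::out_of_range` so they survive release builds, and all index arithmetic happens in `size_t` so `offset + count` cannot overflow a signed index type.

```cpp
#include <atomic>
#include <cstddef>
#include <stdexcept>

template <typename i_t>
struct sb_view_t {
  std::atomic<int>* solved{};
  std::size_t count{};

  // Runtime-checked read: rejects negative and past-the-end indices.
  bool is_solved(i_t local_idx) const
  {
    if (local_idx < 0 || static_cast<std::size_t>(local_idx) >= count)
      throw std::out_of_range("sb_view_t::is_solved: index out of range");
    return solved[static_cast<std::size_t>(local_idx)].load() != 0;
  }

  void mark_solved(i_t local_idx)
  {
    if (local_idx < 0 || static_cast<std::size_t>(local_idx) >= count)
      throw std::out_of_range("sb_view_t::mark_solved: index out of range");
    solved[static_cast<std::size_t>(local_idx)].store(1);
  }

  // Convert to size_t first, then check cnt <= count - off, so the sum of
  // two signed values is never formed and cannot overflow.
  sb_view_t subview(i_t offset, i_t n) const
  {
    if (offset < 0 || n < 0)
      throw std::out_of_range("sb_view_t::subview: negative offset/count");
    const auto off = static_cast<std::size_t>(offset);
    const auto cnt = static_cast<std::size_t>(n);
    if (off > count || cnt > count - off)
      throw std::out_of_range("sb_view_t::subview: range out of bounds");
    return {solved + off, cnt};
  }
};
```

The key detail is checking `cnt <= count - off` after the unsigned conversion rather than computing `off + cnt`.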
cpp/tests/linear_programming/pdlp_test.cu (1)

2243-2254: ⚠️ Potential issue | 🟠 Major

Replace the fixed 200 ms sleep with deterministic synchronization.

On afiro-sized batches PDLP can finish before the delay. That makes these tests flaky and can skip the ConcurrentLimit path entirely. Use a latch/promise/barrier that the solver thread signals once the cooperative path is live, then apply the external marks.

Also applies to: 2309-2320

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/linear_programming/pdlp_test.cu` around lines 2243 - 2254, The
fixed sleep is flaky; replace it with a deterministic synchronization (e.g.,
std::promise/std::future or std::latch) so the main thread waits until the
solver thread actually enters the cooperative DS path. Create a promise/future
(or latch), pass the signal into the pdlp_thread lambda (or have solve_lp expose
a hook) and have the solver thread set the promise when the cooperative path is
live; then have the main thread wait on the future before calling
sb_view.mark_solved(i). Apply the same change to the other occurrence around the
second pdlp_thread usage.
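The handshake suggested above can be sketched with `std::promise`/`std::future`: the solver thread fulfills the promise once its cooperative path is live, and the test thread waits on the future before applying the external marks instead of sleeping for a fixed 200 ms. This is a hedged illustration; the comments mark where the real solver calls would go, and `race_with_handshake` is an invented name.

```cpp
#include <future>
#include <thread>
#include <vector>

int race_with_handshake()
{
  std::promise<void> cooperative_ready;
  std::future<void> ready = cooperative_ready.get_future();
  std::vector<int> external_marks;

  std::thread solver([&] {
    // ... set up the batch and enter the cooperative DS/PDLP path ...
    cooperative_ready.set_value();  // signal: external marks are now safe
    // ... continue solving; externally marked LPs get skipped ...
  });

  ready.wait();                 // deterministic, unlike a fixed sleep
  external_marks.push_back(0);  // stand-in for sb_view.mark_solved(0)
  solver.join();
  return static_cast<int>(external_marks.size());
}
```

A `std::latch` (C++20) would work equally well; the point is that the signal comes from the solver thread itself, so the test cannot race past the cooperative setup.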
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 1044-1054: When simplex_iteration_limit < 1 the code calls
initialize_pseudo_costs_with_estimate(...) which populates pc.strong_branch_down
and pc.strong_branch_up, but the function continues and later merges/overwrites
those estimates with UNSET/NaN values; to fix, add an early return (or
conditional skip of the merge/overwrite block) immediately after calling
initialize_pseudo_costs_with_estimate so the function exits with the
estimate-only pseudo-costs intact and does not proceed to the later
merge/overwrite logic before update_pseudo_costs_from_strong_branching().
- Around line 1587-1595: The current gate "if (rb_mode != 2)" skips the whole
taskloop when rb_mode==2 even if use_pdlp is false, so unreliable_list
candidates never get evaluated; change the condition to allow Dual Simplex
fallback when PDLP is disabled by using "if (rb_mode != 2 || !use_pdlp)" (or
logically equivalent) around the pragma/taskloop that references score_mutex,
sb_view, dual_simplex_obj_down/up, dual_simplex_status_down/up and
unreliable_list, and update any related log text so it correctly reflects that
DS will run when PDLP is unavailable.

In `@cpp/src/branch_and_bound/pseudo_costs.hpp`:
- Line 541: The pdlp_warm_cache member on pseudo_costs_t is mutable shared state
and can race when multiple workers perform reliability/strong branching; make
the cache worker-local or add synchronization: either move the
batch_pdlp_warm_cache_t<i_t, f_t> pdlp_warm_cache out of the shared
pseudo_costs_t (create a stack/local instance inside the branching worker
routine) or add a dedicated omp_mutex_t (e.g., pdlp_warm_cache_mutex) and wrap
every access/update to pdlp_warm_cache in a lock/unlock pair (similar to
existing omp_atomic_t/omp_mutex_t usage) to prevent cross-contamination of
device buffers, scalars, and populated state. Ensure all functions that read or
write pdlp_warm_cache (reliability/strong branching paths) use the new mutex or
the local per-worker instance.

In `@cpp/src/dual_simplex/simplex_solver_settings.hpp`:
- Around line 194-197: The member mip_batch_pdlp_reliability_branching is not
initialized in the class constructor, leaving it with indeterminate value;
update the constructor initializer list (next to
mip_batch_pdlp_strong_branching(0)) to include
mip_batch_pdlp_reliability_branching(0) so the default-constructed
SimplexSolverSettings/related class always sets this field to a defined value.

In `@cpp/src/mip_heuristics/CMakeLists.txt`:
- Line 17: Remove clique_table.cu from the MIP_LP_NECESSARY_FILES list and add
it to the MIP_NON_LP_FILES list in the CMakeLists.txt so that clique_table.cu
(which includes mip_heuristics/utils.cuh) is built only as MIP-non-LP code;
specifically, edit the lines that construct the MIP_LP_NECESSARY_FILES and
MIP_NON_LP_FILES variables to stop listing clique_table.cu in the former and
include it in the latter, and verify any targets or install/grouping logic that
consumes those variables still references the file from the new list.

In `@cpp/src/pdlp/pdlp.cuh`:
- Around line 142-143: The initialization of sb_view_
(dual_simplex::shared_strong_branching_context_view_t<i_t, f_t>) is passing
settings_.shared_sb_solved which is a cuda::std::span<std::atomic<int>> into a
constructor expecting std::span<std::atomic<int>>; explicitly construct a
std::span from the underlying data and size (use
settings_.shared_sb_solved.data() and settings_.shared_sb_solved.size()) so the
types match instead of relying on an implicit conversion between cuda::std::span
and std::span.

In `@cpp/src/pdlp/solve.cu`:
- Around line 897-903: The batch memory estimator always accounts for three full
output buffers but the actual allocations for the third (solution collection)
buffer are skipped when generate_batch_primal_dual_solution is false; update the
total_memory calculation in solve.cu so it conditionally charges the third
output buffer only when generate_batch_primal_dual_solution is true (adjust the
same logic around the corresponding block later in the function where
allocations are made; see references to total_memory, trial_batch_size,
problem.get_n_variables(), problem.get_n_constraints(), and
generate_batch_primal_dual_solution and apply the same fix for the duplicate
estimator at the other occurrence mentioned in the comment).

In `@cpp/src/pdlp/solver_settings.cu`:
- Around line 62-88: The bulk warm-start path in set_pdlp_warm_start_data is
writing initial_primal_weight and initial_step_size directly and bypasses the
numeric guards; update set_pdlp_warm_start_data to validate those values the
same way as set_initial_primal_weight and set_initial_step_size (either by
calling those two setters from set_pdlp_warm_start_data or by applying the same
cuopt_expects checks: >0, !isinf, !isnan) so the warm-start overload enforces
the same invariants before assigning initial_primal_weight_ and
initial_step_size_.

In `@cpp/src/pdlp/translate.hpp`:
- Line 12: Remove the unused CUDA device header include from translate.hpp:
delete the line including <mip_heuristics/problem/problem.cuh> because
translate.hpp only references detail::problem_t from
optimization_problem_interface.hpp and does not use any symbols from
problem.cuh; after removal, confirm translate.hpp still compiles standalone and
that no other code relies on that include being present.

---

Outside diff comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 547-584: The code removes columns via
A_no_slacks.remove_columns(cols_to_remove) but still blindly slices the first n
entries of root_soln, lp.objective, lp.lower, lp.upper into PDLP
(original_root_soln_x and mps_model.set_* calls), which is only correct if all
slack indices in new_slacks are appended (>= n); add an explicit invariant check
or build a proper old→new column map: either assert all new_slacks[j] >= n
before proceeding (and document the invariant), or construct a mapping from
original column index → compacted index after remove_columns and use it to remap
original_root_soln_x, the objective vector, lower/upper bound arrays, and any
branching candidate indices so the CSR matrix and these vectors stay aligned
(refer to cols_to_remove, A_no_slacks, original_root_soln_x,
mps_model.set_objective_coefficients, set_variable_lower_bounds,
set_variable_upper_bounds).

In `@cpp/src/pdlp/pdlp.cu`:
- Around line 801-864: This early-return path fails to notify the
solver-bookkeeping that the remaining resized-batch climbers are finished;
before returning, iterate over
current_termination_strategy_.get_terminations_status().size() and call
sb_view_.mark_solved(...) for each climber using
climber_strategies_[i].original_index (and the appropriate termination/solution
metadata you have just copied into batch_solution_to_return_), mirroring the
behavior used at the later return (line 865) so in-progress Dual Simplex tasks
can stop/skip those chunks.

---

Duplicate comments:
In `@cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp`:
- Around line 303-305: Defaulting generate_batch_primal_dual_solution to false
makes batch solves opt‑in and breaks callers that expect populated
primal/dual/reduced-cost buffers; change the bool
generate_batch_primal_dual_solution default to true (so batch solution
collection is enabled by default/opt‑out) and update the adjacent comment to
reflect "opt-out" semantics so existing tests and code that slice these buffers
continue to work without modification.

In `@cpp/src/branch_and_bound/pseudo_costs.hpp`:
- Around line 416-425: The batch_pdlp_warm_cache_t struct can retain stale
warm-start data after the LP context changes because there is no reset path; add
a member method reset() on batch_pdlp_warm_cache_t that clears initial_primal
and initial_dual (release/resize to 0 on batch_pdlp_handle.get_stream()), sets
step_size and primal_weight back to NaN, pdlp_iteration to -1,
percent_solved_by_batch_pdlp_at_root to f_t(0.0) and populated to false, and
then call this reset() from the pseudo_costs_t code path that
reinitializes/changes the LP context (where the context-reset currently occurs)
so any time cuts/slacks/original-LP are changed the warm cache is invalidated.

In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp`:
- Around line 38-56: The accessors (is_solved, mark_solved, subview) use asserts
and can overflow in release builds; replace the assert-based checks with runtime
bounds validation and throw std::out_of_range on failure, and avoid signed
arithmetic overflow in subview by converting indices to size_t first.
Concretely: in is_solved(i_t local_idx) and mark_solved(i_t local_idx) cast
local_idx to size_t (e.g. size_t idx = static_cast<size_t>(local_idx)), verify
local_idx >= 0 and idx < solved.size(), and throw std::out_of_range with a clear
message instead of asserting; in subview(i_t offset, i_t count) cast both to
size_t (size_t off = static_cast<size_t>(offset), cnt =
static_cast<size_t>(count)), ensure offset and count are non-negative, check off
<= solved.size() and cnt <= solved.size() - off, then return
{solved.subspan(off, cnt)} to avoid offset+count overflow.

In `@cpp/src/pdlp/solve.cu`:
- Around line 1049-1057: The code only checks settings.shared_sb_solved.empty()
but not its length relative to new_bounds, so subspan(i, current_batch_size) can
out-of-bounds; before the batching loop validate that when
settings.shared_sb_solved is non-empty its size() (or extent()) is >=
max_batch_size (or original_new_bounds.size()), and if not, either trim
max_batch_size/current_batch_size accordingly or return/error; then inside the
loop you can safely call settings.shared_sb_solved.subspan(i,
current_batch_size) to populate batch_settings.shared_sb_solved. Ensure you
reference settings.shared_sb_solved, original_new_bounds/new_bounds,
max_batch_size and subspan in the check and handle the shorter span case
consistently.
- Around line 968-972: The current logic lets settings.sub_batch_size override
the memory-checked limit, which can reintroduce OOMs; instead, when computing
optimal_batch_size in solve.cu (symbols: optimal_batch_size,
settings.sub_batch_size, memory_max_batch_size,
detail::optimal_batch_size_handler, max_batch_size, cuopt_assert), clamp
settings.sub_batch_size to the memory ceiling (e.g., use
min(settings.sub_batch_size, memory_max_batch_size) or validate and reject
values > memory_max_batch_size) so the final optimal_batch_size never exceeds
memory_max_batch_size (and still respects max_batch_size), then keep the
cuopt_assert as a final sanity check.

In `@cpp/tests/linear_programming/pdlp_test.cu`:
- Around line 2243-2254: The fixed sleep is flaky; replace it with a
deterministic synchronization (e.g., std::promise/std::future or std::latch) so
the main thread waits until the solver thread actually enters the cooperative DS
path. Create a promise/future (or latch), pass the signal into the pdlp_thread
lambda (or have solve_lp expose a hook) and have the solver thread set the
promise when the cooperative path is live; then have the main thread wait on the
future before calling sb_view.mark_solved(i). Apply the same change to the other
occurrence around the second pdlp_thread usage.
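The `pdlp.cuh` item above, rebuilding a `std::span` explicitly from `data()` and `size()` rather than relying on an implicit `cuda::std::span` conversion, can be illustrated in isolation. `device_span` below is a minimal stand-in for `cuda::std::span` (same `data()`/`size()` surface, distinct type), and `host_span`/`to_host_span` are invented for the example.

```cpp
#include <atomic>
#include <cstddef>

template <typename T>
struct device_span {  // stand-in for cuda::std::span<T>
  T* ptr{};
  std::size_t len{};
  T* data() const { return ptr; }
  std::size_t size() const { return len; }
};

template <typename T>
struct host_span {  // stand-in for std::span<T>
  T* ptr{};
  std::size_t len{};
};

// Explicit reconstruction from the raw pointer and length: no implicit
// conversion between the two span types is assumed, which is exactly what
// the review asks the sb_view_ initialization to do.
template <typename T>
host_span<T> to_host_span(device_span<T> s)
{
  return {s.data(), s.size()};
}
```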
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 982cb111-beb8-4701-9cff-5285678a13b5

📥 Commits

Reviewing files that changed from the base of the PR and between 928a1e1 and dd07576.

📒 Files selected for processing (23)
  • cpp/include/cuopt/linear_programming/constants.h
  • cpp/include/cuopt/linear_programming/mip/solver_settings.hpp
  • cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp
  • cpp/src/branch_and_bound/branch_and_bound.cpp
  • cpp/src/branch_and_bound/pseudo_costs.cpp
  • cpp/src/branch_and_bound/pseudo_costs.hpp
  • cpp/src/branch_and_bound/shared_strong_branching_context.hpp
  • cpp/src/dual_simplex/simplex_solver_settings.hpp
  • cpp/src/math_optimization/solver_settings.cu
  • cpp/src/mip_heuristics/CMakeLists.txt
  • cpp/src/mip_heuristics/solver.cu
  • cpp/src/pdlp/pdlp.cu
  • cpp/src/pdlp/pdlp.cuh
  • cpp/src/pdlp/pdlp_constants.hpp
  • cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu
  • cpp/src/pdlp/solve.cu
  • cpp/src/pdlp/solver_settings.cu
  • cpp/src/pdlp/termination_strategy/termination_strategy.cu
  • cpp/src/pdlp/termination_strategy/termination_strategy.hpp
  • cpp/src/pdlp/translate.hpp
  • cpp/src/pdlp/utilities/ping_pong_graph.cu
  • cpp/tests/linear_programming/pdlp_test.cu
  • python/cuopt_server/cuopt_server/utils/linear_programming/data_definition.py
💤 Files with no reviewable changes (1)
  • cpp/src/pdlp/pdlp_constants.hpp

Comment on lines +1044 to +1054
if (simplex_iteration_limit < 1) {
  initialize_pseudo_costs_with_estimate(original_lp,
                                        settings,
                                        root_vstatus,
                                        root_solution,
                                        basic_list,
                                        nonbasic_list,
                                        fractional,
                                        basis_factors,
                                        pc);
} else {


⚠️ Potential issue | 🟠 Major

Return after the estimate-only fallback.

When simplex_iteration_limit < 1, this branch fills pc.strong_branch_down/up from initialize_pseudo_costs_with_estimate(), but the unconditional merge later still runs with every DS status left UNSET and every PDLP value NaN. That overwrites the estimates back to NaN just before update_pseudo_costs_from_strong_branching().

Suggested fix
   if (simplex_iteration_limit < 1) {
     initialize_pseudo_costs_with_estimate(original_lp,
                                           settings,
                                           root_vstatus,
                                           root_solution,
                                           basic_list,
                                           nonbasic_list,
                                           fractional,
                                           basis_factors,
                                           pc);
+    pc.update_pseudo_costs_from_strong_branching(fractional, root_solution.x);
+    return;
   } else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 1044 - 1054, When
simplex_iteration_limit < 1 the code calls
initialize_pseudo_costs_with_estimate(...) which populates pc.strong_branch_down
and pc.strong_branch_up, but the function continues and later merges/overwrites
those estimates with UNSET/NaN values; to fix, add an early return (or
conditional skip of the merge/overwrite block) immediately after calling
initialize_pseudo_costs_with_estimate so the function exits with the
estimate-only pseudo-costs intact and does not proceed to the later
merge/overwrite logic before update_pseudo_costs_from_strong_branching().

Comment on lines +1587 to +1595
if (rb_mode != 2) {
#pragma omp taskloop if (num_tasks > 1) priority(task_priority) num_tasks(num_tasks) \
shared(score_mutex, \
sb_view, \
dual_simplex_obj_down, \
dual_simplex_obj_up, \
dual_simplex_status_down, \
dual_simplex_status_up, \
unreliable_list)

⚠️ Potential issue | 🟠 Major

Fallback to Dual Simplex when PDLP-only RB gets disabled.

This gate uses rb_mode != 2, but use_pdlp can still be false in mode 2 (empty warm cache, deterministic mode, sub-MIP, too few candidates, low root hit rate). In that case the PDLP task is skipped too, so none of the unreliable candidates are actually evaluated even though the earlier logs say "using DS only".

One way to preserve the advertised fallback
-  if (rb_mode != 2) {
+  const bool run_dual_simplex = (rb_mode != 2) || !use_pdlp;
+  if (run_dual_simplex) {
 #pragma omp taskloop if (num_tasks > 1) priority(task_priority) num_tasks(num_tasks) \
   shared(score_mutex,                                                                \
            sb_view,                                                                  \
            dual_simplex_obj_down,                                                    \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 1587 - 1595, The
current gate "if (rb_mode != 2)" skips the whole taskloop when rb_mode==2 even
if use_pdlp is false, so unreliable_list candidates never get evaluated; change
the condition to allow Dual Simplex fallback when PDLP is disabled by using "if
(rb_mode != 2 || !use_pdlp)" (or logically equivalent) around the
pragma/taskloop that references score_mutex, sb_view, dual_simplex_obj_down/up,
dual_simplex_status_down/up and unreliable_list, and update any related log text
so it correctly reflects that DS will run when PDLP is unavailable.

omp_atomic_t<i_t> num_strong_branches_completed = 0;
omp_atomic_t<int64_t> strong_branching_lp_iter = 0;

batch_pdlp_warm_cache_t<i_t, f_t> pdlp_warm_cache;

⚠️ Potential issue | 🟠 Major

Don't store the warm-start cache as unsynchronized shared worker state.

pseudo_costs_t already uses omp_atomic_t and omp_mutex_t for data that workers update concurrently, but pdlp_warm_cache is a plain mutable member. If two workers hit reliability/strong branching at once, they can race on the cached device buffers, scalars, and the populated flag, cross-contaminating warm starts between nodes. Make the cache worker-local or guard it with a dedicated mutex.

As per coding guidelines "Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.hpp` at line 541, The pdlp_warm_cache
member on pseudo_costs_t is mutable shared state and can race when multiple
workers perform reliability/strong branching; make the cache worker-local or add
synchronization: either move the batch_pdlp_warm_cache_t<i_t, f_t>
pdlp_warm_cache out of the shared pseudo_costs_t (create a stack/local instance
inside the branching worker routine) or add a dedicated omp_mutex_t (e.g.,
pdlp_warm_cache_mutex) and wrap every access/update to pdlp_warm_cache in a
lock/unlock pair (similar to existing omp_atomic_t/omp_mutex_t usage) to prevent
cross-contamination of device buffers, scalars, and populated state. Ensure all
functions that read or write pdlp_warm_cache (reliability/strong branching
paths) use the new mutex or the local per-worker instance.

Comment on lines +194 to +197
i_t
mip_batch_pdlp_strong_branching; // 0 = DS only, 1 = cooperative DS + PDLP, 2 = batch PDLP only
i_t mip_batch_pdlp_reliability_branching; // 0 = DS only, 1 = cooperative DS + PDLP, 2 = batch
// PDLP only

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '100,112p' cpp/src/dual_simplex/simplex_solver_settings.hpp
echo
sed -n '1,40p' cpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuh
echo
sed -n '1,40p' cpp/src/mip_heuristics/diversity/lns/rins.cu

Repository: NVIDIA/cuopt

Length of output: 3347


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the full file to understand structure
echo "=== File structure and class definition ==="
wc -l cpp/src/dual_simplex/simplex_solver_settings.hpp

echo -e "\n=== Constructor and initializer list (lines 40-130) ==="
sed -n '40,130p' cpp/src/dual_simplex/simplex_solver_settings.hpp

echo -e "\n=== Field declarations around lines 180-210 ==="
sed -n '180,210p' cpp/src/dual_simplex/simplex_solver_settings.hpp

echo -e "\n=== Check for in-class initializers (search for = in field declarations) ==="
rg 'mip_batch_pdlp_reliability_branching|mip_batch_pdlp_strong_branching' cpp/src/dual_simplex/simplex_solver_settings.hpp -A 1 -B 1

echo -e "\n=== Check other constructors ==="
rg 'simplex_solver_settings_t\s*\(' cpp/src/dual_simplex/simplex_solver_settings.hpp

Repository: NVIDIA/cuopt

Length of output: 5723


mip_batch_pdlp_reliability_branching is not initialized on default construction.

The constructor initializer list includes mip_batch_pdlp_strong_branching(0) but omits mip_batch_pdlp_reliability_branching, and the field declaration has no in-class initializer. Default-constructed instances will have an indeterminate value, violating the guideline to "validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving." Add mip_batch_pdlp_reliability_branching(0), to the constructor initializer list.

🔧 Proposed fix
       cut_change_threshold(1e-3),
       cut_min_orthogonality(0.5),
       mip_batch_pdlp_strong_branching(0),
+      mip_batch_pdlp_reliability_branching(0),
       strong_branching_simplex_iteration_limit(-1),
       random_seed(0),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/dual_simplex/simplex_solver_settings.hpp` around lines 194 - 197, The
member mip_batch_pdlp_reliability_branching is not initialized in the class
constructor, leaving it with indeterminate value; update the constructor
initializer list (next to mip_batch_pdlp_strong_branching(0)) to include
mip_batch_pdlp_reliability_branching(0) so the default-constructed
SimplexSolverSettings/related class always sets this field to a defined value.

${CMAKE_CURRENT_SOURCE_DIR}/presolve/third_party_presolve.cpp
${CMAKE_CURRENT_SOURCE_DIR}/presolve/gf2_presolve.cpp
${CMAKE_CURRENT_SOURCE_DIR}/solution/solution.cu
${CMAKE_CURRENT_SOURCE_DIR}/presolve/conflict_graph/clique_table.cu

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read-only verification: confirm whether clique_table LP-only inclusion pulls MIP-coupled symbols.

set -euo pipefail

echo "== Locate relevant files =="
fd -a 'CMakeLists\.txt$' | rg 'cpp/src/mip_heuristics/CMakeLists\.txt'
fd -a 'clique_table\.cu$'
fd -a 'utils\.cuh$'
fd -a 'problem\.cuh$'

echo
echo "== CMake source set selection =="
rg -n -C3 'MIP_LP_NECESSARY_FILES|MIP_NON_LP_FILES|BUILD_LP_ONLY|clique_table\.cu' cpp/src/mip_heuristics/CMakeLists.txt

echo
echo "== Include chain from clique_table.cu =="
rg -n -C2 '^#include' cpp/src/mip_heuristics/presolve/conflict_graph/clique_table.cu

echo
echo "== MIP/problem coupling markers inside utils.cuh =="
rg -n -C2 'problem_t|mixed_integer|branch|cut|mip' cpp/src/mip_heuristics/utils.cuh

Repository: NVIDIA/cuopt

Length of output: 4157


🏁 Script executed:

# Check where problem_t is defined and understand its scope
rg -n 'template.*problem_t|struct problem_t|class problem_t' cpp/src/mip_heuristics/problem/problem.cuh | head -20

Repository: NVIDIA/cuopt

Length of output: 76


🏁 Script executed:

# Check what problem_t is actually used for in utils.cuh to understand coupling severity
rg -n 'problem_t' cpp/src/mip_heuristics/utils.cuh | head -30

Repository: NVIDIA/cuopt

Length of output: 441


🏁 Script executed:

# Verify whether clique_table.cu functionality is actually required for LP-only builds
# by checking if there are any references to it in LP-only code paths
rg -r 'clique_table' cpp/src --type-list | head -20

Repository: NVIDIA/cuopt

Length of output: 445


🏁 Script executed:

# Get the full problem_t class definition to understand its scope
sed -n '56,150p' cpp/src/mip_heuristics/problem/problem.cuh

Repository: NVIDIA/cuopt

Length of output: 5038


🏁 Script executed:

# Find all files that reference clique_table (functions or includes)
rg 'clique_table' cpp/src --type cpp --type cu

Repository: NVIDIA/cuopt

Length of output: 82


🏁 Script executed:

# Check if problem.cu (which is in MIP_LP_NECESSARY_FILES) actually uses problem_t in LP-only context
head -50 cpp/src/mip_heuristics/problem/problem.cu

Repository: NVIDIA/cuopt

Length of output: 1548


🏁 Script executed:

# Search for references to clique_table functions/files more carefully
rg 'clique_table' cpp/src --type cpp

Repository: NVIDIA/cuopt

Length of output: 3657


🏁 Script executed:

# Check problem.cuh for clique_table member and where it's initialized
rg -n 'clique_table' cpp/src/mip_heuristics/problem/problem.cuh

Repository: NVIDIA/cuopt

Length of output: 143


🏁 Script executed:

# Search for clique_table_t definition
rg -n 'class clique_table_t|struct clique_table_t|template.*clique_table_t' cpp/src

Repository: NVIDIA/cuopt

Length of output: 449


🏁 Script executed:

# Search for where clique_table member is initialized/used in problem code
rg -n 'clique_table\s*=' cpp/src/mip_heuristics/problem/ -A2 -B2

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Check if clique_table building/initialization is in problem.cu or elsewhere
rg -n 'clique_table_t<' cpp/src/mip_heuristics/problem/problem.cu

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Verify: does clique_table.cu define functions that are called from LP-safe code?
rg -n '^(namespace|template|void|bool|auto|int|std::)' cpp/src/mip_heuristics/presolve/conflict_graph/clique_table.cu | head -30

Repository: NVIDIA/cuopt

Length of output: 1651


🏁 Script executed:

# Check if clique_table building is triggered from problem.cu initialization
rg -n 'make_clique|build.*clique|clique.*build' cpp/src/mip_heuristics/

Repository: NVIDIA/cuopt

Length of output: 492


🏁 Script executed:

# Find all callers of build_clique_table
rg -n 'build_clique_table' cpp/src --type cpp -B2 -A2

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Check if build_clique_table is called from problem.cu or problem-related code
rg -n 'build_clique_table' cpp/src/mip_heuristics/problem/

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Verify the full include chain: does clique_table.cu actually get compiled in LP-only?
# Check if there's any conditional compilation guards
rg -n 'BUILD_LP_ONLY|#if|#ifdef' cpp/src/mip_heuristics/presolve/conflict_graph/clique_table.cu | head -20

Repository: NVIDIA/cuopt

Length of output: 506


🏁 Script executed:

# Find all references to clique_table in problem.cu
rg -n 'clique_table' cpp/src/mip_heuristics/problem/problem.cu

Repository: NVIDIA/cuopt

Length of output: 221


🏁 Script executed:

# Check problem.cu to see if it initializes or uses the clique_table member
sed -n '1,100p' cpp/src/mip_heuristics/problem/problem.cu | rg -n 'clique'

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Get summary: who actually calls build_clique_table?
rg -r 'build_clique_table\(' cpp/src --type cpp

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Final confirmation: verify that clique_table functions are truly never called in LP-only context
# by checking all instantiations and calls
rg -n 'clique_table_t<' cpp/src/cuts/cuts.cpp cpp/src/branch_and_bound/branch_and_bound.cpp | head -5

Repository: NVIDIA/cuopt

Length of output: 531


🏁 Script executed:

# Double-check: are there any LP-only modules that might indirectly call clique_table?
# Check what problem.cu actually does with clique_table
rg -n -A5 'clique_table' cpp/src/mip_heuristics/problem/problem.cu | head -20

Repository: NVIDIA/cuopt

Length of output: 912


Move clique_table.cu from MIP_LP_NECESSARY_FILES to MIP_NON_LP_FILES

clique_table.cu is MIP-exclusive functionality (only used by branch-and-bound and cutting plane generation), yet it includes mip_heuristics/utils.cuh, which brings MIP-coupled types and headers into the LP-only build path unnecessarily. The file should be in MIP_NON_LP_FILES instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/CMakeLists.txt` at line 17, Remove clique_table.cu
from the MIP_LP_NECESSARY_FILES list and add it to the MIP_NON_LP_FILES list in
the CMakeLists.txt so that clique_table.cu (which includes
mip_heuristics/utils.cuh) is built only as MIP-non-LP code; specifically, edit
the lines that construct the MIP_LP_NECESSARY_FILES and MIP_NON_LP_FILES
variables to stop listing clique_table.cu in the former and include it in the
latter, and verify any targets or install/grouping logic that consumes those
variables still references the file from the new list.

Comment on lines +142 to +143
dual_simplex::shared_strong_branching_context_view_t<i_t, f_t> sb_view_{
settings_.shared_sb_solved};

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== shared_sb_solved declaration ==="
rg -n -C2 --type=cpp 'shared_sb_solved' cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp

echo "=== shared_strong_branching_context_view_t span type ==="
rg -n -C3 --type=cpp 'shared_strong_branching_context_view_t|std::span<|cuda::std::span<' cpp/src/branch_and_bound/shared_strong_branching_context.hpp

echo "=== sb_view_ initialization site ==="
rg -n -C2 --type=cpp 'sb_view_|shared_sb_solved' cpp/src/pdlp/pdlp.cuh

Repository: NVIDIA/cuopt

Length of output: 1441


Use explicit span conversion to ensure cuda::std::span → std::span compatibility.

The constructor at lines 142–143 receives settings_.shared_sb_solved (type cuda::std::span<std::atomic<int>>) but expects std::span<std::atomic<int>>. These are distinct types from different namespaces. Avoid relying on implicit conversion; rewrite as:

Suggested fix
dual_simplex::shared_strong_branching_context_view_t<i_t, f_t> sb_view_{
    std::span<std::atomic<int>>(settings_.shared_sb_solved.data(), settings_.shared_sb_solved.size())};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/pdlp.cuh` around lines 142 - 143, The initialization of sb_view_
(dual_simplex::shared_strong_branching_context_view_t<i_t, f_t>) is passing
settings_.shared_sb_solved which is a cuda::std::span<std::atomic<int>> into a
constructor expecting std::span<std::atomic<int>>; explicitly construct a
std::span from the underlying data and size (use
settings_.shared_sb_solved.data() and settings_.shared_sb_solved.size()) so the
types match instead of relying on an implicit conversion between cuda::std::span
and std::span.

Comment on lines +897 to +903
total_memory += problem.get_n_variables() * trial_batch_size * sizeof(f_t);
total_memory += problem.get_n_constraints() * trial_batch_size * sizeof(f_t);
total_memory += problem.get_n_variables() * trial_batch_size * sizeof(f_t);

// Add a 50% overhead to make sure we have enough memory considering other parts of the solver may
// allocate at the same time
total_memory *= 1.5;
// Add a 70% overhead to make sure we have enough memory considering other parts of the solver may
// need memory later while the batch PDLP is running
total_memory *= 1.7;

⚠️ Potential issue | 🟠 Major

Keep the batch memory estimator consistent with optional solution collection.

The estimator always charges three full output buffers, but the actual allocations later in this function are skipped when generate_batch_primal_dual_solution is false. On large strong-branching batches this can force unnecessarily tiny sub-batches, or even return NumericalError for a batch that would fit without solution collection.

Also applies to: 1004-1012

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/solve.cu` around lines 897 - 903, The batch memory estimator
always accounts for three full output buffers but the actual allocations for the
third (solution collection) buffer are skipped when
generate_batch_primal_dual_solution is false; update the total_memory
calculation in solve.cu so it conditionally charges the third output buffer only
when generate_batch_primal_dual_solution is true (adjust the same logic around
the corresponding block later in the function where allocations are made; see
references to total_memory, trial_batch_size, problem.get_n_variables(),
problem.get_n_constraints(), and generate_batch_primal_dual_solution and apply
the same fix for the duplicate estimator at the other occurrence mentioned in
the comment).
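The shape of the fix can be sketched as a standalone estimator. All names here (estimate_batch_bytes, n_variables, n_constraints, collect_solutions) are illustrative, not the real cuOpt signatures; the point is that the solution-output buffers are only charged when collection is actually enabled, so the estimate matches the allocations performed later.

```cpp
#include <cstddef>

// Hypothetical sketch: estimate device memory for one trial batch, charging
// the primal/dual solution copies only when solution collection is on.
std::size_t estimate_batch_bytes(std::size_t n_variables,
                                 std::size_t n_constraints,
                                 std::size_t batch_size,
                                 bool collect_solutions)
{
  std::size_t total = 0;
  // Working iterate is always needed, regardless of collection.
  total += n_variables * batch_size * sizeof(double);
  if (collect_solutions) {
    // Primal and dual solution copies are only allocated on request,
    // so only charge them here when the flag is set.
    total += n_variables * batch_size * sizeof(double);
    total += n_constraints * batch_size * sizeof(double);
  }
  // Headroom factors for concurrent allocations elsewhere in the solver,
  // mirroring the 1.5x and 1.7x multipliers in the original estimator.
  return static_cast<std::size_t>(total * 1.5 * 1.7);
}
```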

Comment on lines 62 to 88
void pdlp_solver_settings_t<i_t, f_t>::set_initial_step_size(f_t initial_step_size)
{
cuopt_expects(initial_step_size > f_t(0),
error_type_t::ValidationError,
"Initial step size must be greater than 0");
cuopt_expects(!std::isinf(initial_step_size),
error_type_t::ValidationError,
"Initial step size must be finite");
cuopt_expects(!std::isnan(initial_step_size),
error_type_t::ValidationError,
"Initial step size must be a number");
initial_step_size_ = std::make_optional(initial_step_size);
}

template <typename i_t, typename f_t>
void pdlp_solver_settings_t<i_t, f_t>::set_initial_primal_weight(f_t initial_primal_weight)
{
cuopt_expects(initial_primal_weight > f_t(0),
error_type_t::ValidationError,
"Initial primal weight must be greater than 0");
cuopt_expects(!std::isinf(initial_primal_weight),
error_type_t::ValidationError,
"Initial primal weight must be finite");
cuopt_expects(!std::isnan(initial_primal_weight),
error_type_t::ValidationError,
"Initial primal weight must be a number");
initial_primal_weight_ = std::make_optional(initial_primal_weight);

⚠️ Potential issue | 🟠 Major

The bulk warm-start path still bypasses these new numeric guards.

set_pdlp_warm_start_data(...) later in this file writes the warm-start initial_primal_weight and initial_step_size directly at Lines 316-317, so callers using that overload can still inject <=0, inf, or nan values. Please reuse the same validation there so both entry points enforce the same solver-state invariant. As per coding guidelines: "Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)".

🔧 Suggested direction
   cuopt_expects(last_restart_duality_gap_dual_solution != nullptr,
                 error_type_t::ValidationError,
                 "last_restart_duality_gap_dual_solution cannot be null");
+  cuopt_expects(initial_primal_weight > f_t(0) &&
+                  !std::isinf(initial_primal_weight) &&
+                  !std::isnan(initial_primal_weight),
+                error_type_t::ValidationError,
+                "Initial primal weight must be finite and greater than 0");
+  cuopt_expects(initial_step_size > f_t(0) &&
+                  !std::isinf(initial_step_size) &&
+                  !std::isnan(initial_step_size),
+                error_type_t::ValidationError,
+                "Initial step size must be finite and greater than 0");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/solver_settings.cu` around lines 62 - 88, The bulk warm-start
path in set_pdlp_warm_start_data is writing initial_primal_weight and
initial_step_size directly and bypasses the numeric guards; update
set_pdlp_warm_start_data to validate those values the same way as
set_initial_primal_weight and set_initial_step_size (either by calling those two
setters from set_pdlp_warm_start_data or by applying the same cuopt_expects
checks: >0, !isinf, !isnan) so the warm-start overload enforces the same
invariants before assigning initial_primal_weight_ and initial_step_size_.


#include <cuopt/linear_programming/optimization_problem_interface.hpp>

#include <mip_heuristics/problem/problem.cuh>

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check if problem.cuh types are used in translate.hpp and find potential compilation issues

echo "=== 1. Search for usage of mip_heuristics namespace/types in translate.hpp ==="
rg -n "mip_heuristics::" cpp/src/pdlp/translate.hpp || echo "No direct mip_heuristics namespace usage found"

echo -e "\n=== 2. Find all files that include translate.hpp ==="
rg -n '#include.*translate\.hpp' --type=cpp --type=cu -C2

echo -e "\n=== 3. Check if translate.hpp is included in non-CUDA files (.cpp) ==="
rg -l '#include.*translate\.hpp' --type=cpp -g '!*.cu' || echo "No .cpp files include translate.hpp"

echo -e "\n=== 4. Search for problem_t usage to verify source ==="
rg -n "detail::problem_t" cpp/src/pdlp/translate.hpp

echo -e "\n=== 5. Check what problem.cuh exports that might be needed ==="
rg -n "^(class|struct|enum|namespace|template)" cpp/src/mip_heuristics/problem/problem.cuh | head -20

Repository: NVIDIA/cuopt

Length of output: 994


Remove unused CUDA device header include.

Line 12 adds #include <mip_heuristics/problem/problem.cuh>, but verification shows no symbols from this include are used in translate.hpp. Functions use detail::problem_t from optimization_problem_interface.hpp (line 10), not from problem.cuh. Additionally, no files currently include translate.hpp, so there is no compilation impact at present.

Remove this unused include.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/translate.hpp` at line 12, Remove the unused CUDA device header
include from translate.hpp: delete the line including
<mip_heuristics/problem/problem.cuh> because translate.hpp only references
detail::problem_t from optimization_problem_interface.hpp and does not use any
symbols from problem.cuh; after removal, confirm translate.hpp still compiles
standalone and that no other code relies on that include being present.

@Kh4ster
Contributor Author

Kh4ster commented Apr 3, 2026

/ok to test bb50ad1


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/pdlp/pdlp.cu`:
- Around line 780-798: Add a single upfront cuopt_assert before using
climber_strategies_[i].original_index to ensure the shared strong-branching span
matches the original batch width: when sb_view_.is_valid() is true, assert that
sb_view_ (or shared_sb_solved/span) size/width equals the original batch width
used to generate climber_strategies_ (so original_index is in-bounds for
shared_strong_branching_context_view_t). Place this check immediately inside the
if (sb_view_.is_valid()) block guarding the loop that accesses
climber_strategies_ (and add the same guard at the other occurrences noted
around the blocks using original_index at the locations corresponding to lines
~849-850, ~866-869, and ~928-929).
- Around line 789-790: When you set a climber to
pdlp_termination_status_t::ConcurrentLimit via
current_termination_strategy_.set_termination_status, do not serialize PDLP
iterates into that climber's result: either propagate the external solver's
result into the per-climber payload or skip filling primal/dual/reduced-cost
buffers, step metadata and the solved_by field for that climber. Locate the
blocks that read from pdhg_solver_/PDLP state and populate per-climber solution
data (the occurrences around the current call and the other ranges noted:
~818-849, 872-878, 897-928) and add a guard that if status ==
pdlp_termination_status_t::ConcurrentLimit then avoid populating the iterate
buffers and step metadata (or instead copy the externally-provided result
structure) and ensure solved_by is set consistently from the external-solver
branch.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 17941937-9cfb-4973-a40a-f3ce01b5a538

📥 Commits

Reviewing files that changed from the base of the PR and between dd07576 and bb50ad1.

📒 Files selected for processing (1)
  • cpp/src/pdlp/pdlp.cu

Comment on lines +780 to +798
if (sb_view_.is_valid()) {
for (size_t i = 0; i < climber_strategies_.size(); ++i) {
// If PDLP has solved it to optimality we want to keep it and resolved both solvers having
// solved the problem later
if (current_termination_strategy_.is_done(
current_termination_strategy_.get_termination_status(i)))
continue;
const i_t local_idx = climber_strategies_[i].original_index;
if (sb_view_.is_solved(local_idx)) {
current_termination_strategy_.set_termination_status(
i, pdlp_termination_status_t::ConcurrentLimit);
#ifdef BATCH_VERBOSE_MODE
std::cout << "[COOP SB] DS already solved climber " << i << " (original_index " << local_idx
<< "), synced to ConcurrentLimit at step " << internal_solver_iterations_
<< std::endl;
#endif
}
}
}

⚠️ Potential issue | 🟠 Major

Validate shared_sb_solved against the original batch size before using original_index.

sb_view_.is_valid() only proves the span is non-empty. These loads/stores use climber_strategies_[i].original_index, which stays in the original batch numbering after compaction. If the caller passes a shorter span, the assert inside shared_strong_branching_context_view_t disappears in release builds and this becomes an out-of-bounds atomic access. Please add a single upfront cuopt_assert that the shared span matches the original batch width.

Suggested guard
+if (sb_view_.is_valid()) {
+  cuopt_assert(sb_view_.solved.size() == settings_.new_bounds.size(),
+               "shared_sb_solved must match the original batch size");
+}
+
 if (sb_view_.is_valid()) {
   for (size_t i = 0; i < climber_strategies_.size(); ++i) {

As per coding guidelines, "Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations".

Also applies to: 849-850, 866-869, 928-929

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/pdlp.cu` around lines 780 - 798, Add a single upfront
cuopt_assert before using climber_strategies_[i].original_index to ensure the
shared strong-branching span matches the original batch width: when
sb_view_.is_valid() is true, assert that sb_view_ (or shared_sb_solved/span)
size/width equals the original batch width used to generate climber_strategies_
(so original_index is in-bounds for shared_strong_branching_context_view_t).
Place this check immediately inside the if (sb_view_.is_valid()) block guarding
the loop that accesses climber_strategies_ (and add the same guard at the other
occurrences noted around the blocks using original_index at the locations
corresponding to lines ~849-850, ~866-869, and ~928-929).

Comment on lines +789 to +790
current_termination_strategy_.set_termination_status(
i, pdlp_termination_status_t::ConcurrentLimit);

⚠️ Potential issue | 🟠 Major

Don't serialize PDLP iterates as the result for ConcurrentLimit climbers.

Once a climber is flipped to pdlp_termination_status_t::ConcurrentLimit, this method still fills its primal/dual/reduced-cost buffers and step metadata from pdhg_solver_ / PDLP state. That means the status says “externally finished”, but the payload is just PDLP’s last iterate, and solved_by is only written on the PDLP branch. Please either plumb the external solver result through here or skip populating per-climber solution payloads for ConcurrentLimit.

As per coding guidelines, "Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results".

Also applies to: 818-849, 872-878, 897-928

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/pdlp.cu` around lines 789 - 790, When you set a climber to
pdlp_termination_status_t::ConcurrentLimit via
current_termination_strategy_.set_termination_status, do not serialize PDLP
iterates into that climber's result: either propagate the external solver's
result into the per-climber payload or skip filling primal/dual/reduced-cost
buffers, step metadata and the solved_by field for that climber. Locate the
blocks that read from pdhg_solver_/PDLP state and populate per-climber solution
data (the occurrences around the current call and the other ranges noted:
~818-849, 872-878, 897-928) and add a guard that if status ==
pdlp_termination_status_t::ConcurrentLimit then avoid populating the iterate
buffers and step metadata (or instead copy the externally-provided result
structure) and ensure solved_by is set consistently from the external-solver
branch.
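The guard being asked for can be sketched minimally. Everything here is hypothetical scaffolding (term_status, climber_result, fill_results stand in for the real PDLP types); it only shows the invariant: a climber flipped to ConcurrentLimit never receives PDLP's last iterate as its payload, so status and payload stay consistent.

```cpp
#include <vector>

// Illustrative termination statuses; ConcurrentLimit marks a climber that
// the external solver (dual simplex) finished first.
enum class term_status { Unset, Optimal, ConcurrentLimit };

struct climber_result {
  term_status status = term_status::Unset;
  std::vector<double> primal;  // only meaningful when PDLP produced it
  bool filled_by_pdlp = false; // stand-in for the solved_by field
};

// Sketch of the guard: skip serializing PDLP's iterate into climbers that
// were flipped to ConcurrentLimit, instead of overwriting their payload.
void fill_results(std::vector<climber_result>& climbers,
                  const std::vector<double>& pdlp_iterate)
{
  for (auto& c : climbers) {
    if (c.status == term_status::ConcurrentLimit) continue;  // the guard
    c.primal         = pdlp_iterate;
    c.filled_by_pdlp = true;
  }
}
```

The other option the review mentions, plumbing the external solver's result into the per-climber payload, would replace the `continue` with a copy from that result structure.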


Labels

improvement Improves an existing functionality mip non-breaking Introduces a non-breaking change P1 pdlp

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants