Add exception handling for pdlp in concurrent mode#966
Add exception handling for pdlp in concurrent mode#966Iroy30 wants to merge 9 commits intoNVIDIA:mainfrom
Conversation
📝 WalkthroughWalkthroughThe change adds exception handling scaffolding to the concurrent PDLP runner. It wraps the main execution in try/catch, capturing exceptions in an std::exception_ptr variable and signaling concurrent_halt on failure while preserving existing behavior when no exception occurs. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment Tip You can make CodeRabbit's review stricter and more nitpicky using the `assertive` profile, if that's what you prefer.Change the |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
cpp/src/pdlp/solve.cu (1)
1190-1192: TODO lacks issue tracking; commented-out code should be cleaned up.The TODO indicates a known intermittent issue but provides no way to track resolution. Additionally, commented-out code with
printfstatements should not remain in production code.Recommendations:
- Create a tracked issue and reference it in the comment (e.g.,
// TODO(#123): ...)- Either remove the commented-out rethrow entirely, or conditionally enable it behind a debug flag if needed for investigation
♻️ Suggested cleanup
- // TODO: Active Issue: PDLP throws an Exception interminttently. - // if (pdlp_exception) { printf("Rethrowing PDLP exception from concurrent mode\n"); - // std::rethrow_exception(pdlp_exception); } + // TODO(`#ISSUE_NUMBER`): PDLP throws exceptions intermittently in concurrent mode. + // Once root cause is fixed, rethrow: std::rethrow_exception(pdlp_exception);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/pdlp/solve.cu` around lines 1190 - 1192, Remove the dead commented-out rethrow and replace the vague TODO with a tracked issue reference and optional debug guard: create a repository issue for the intermittent PDLP exception and update the comment to include the issue ID (e.g., "// TODO(#<issue>): Intermittent PDLP exception observed when running concurrent mode"), and either delete the commented block that references pdlp_exception/printf or move that logic behind a compile-time/debug flag (e.g., PDLP_DEBUG_RETHROW) so the rethrow of pdlp_exception can be enabled only for debugging; locate the commented code around pdlp_exception in solve.cu and apply the change accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/pdlp/solve.cu`:
- Around line 1175-1183: The catch block around run_pdlp currently stores the
exception in pdlp_exception but discards it, losing diagnostics; update the
catch in solve.cu to extract and surface the exception message (from
pdlp_exception or std::current_exception()) — e.g., convert to a string via
std::rethrow_exception / try-catch and log it with the existing logging
facilities and/or set a more informative termination status on sol_pdlp (instead
of only pdlp_termination_status_t::NumericalError) before setting
*settings_pdlp.concurrent_halt = 1; ensure run_pdlp, pdlp_exception, sol_pdlp,
and settings_pdlp.concurrent_halt are referenced so callers receive meaningful
error info.
---
Nitpick comments:
In `@cpp/src/pdlp/solve.cu`:
- Around line 1190-1192: Remove the dead commented-out rethrow and replace the
vague TODO with a tracked issue reference and optional debug guard: create a
repository issue for the intermittent PDLP exception and update the comment to
include the issue ID (e.g., "// TODO(#<issue>): Intermittent PDLP exception
observed when running concurrent mode"), and either delete the commented block
that references pdlp_exception/printf or move that logic behind a
compile-time/debug flag (e.g., PDLP_DEBUG_RETHROW) so the rethrow of
pdlp_exception can be enabled only for debugging; locate the commented code
around pdlp_exception in solve.cu and apply the change accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 36088b3c-2460-43f0-af07-139729eea9b0
📒 Files selected for processing (1)
cpp/src/pdlp/solve.cu
Description
Fixes fails when pdlp throws exception in concurrent mode. Still needs evaluation on why pdlp fails.
Issue
Checklist