Reduce QP overheads #1140
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds native free-variable support for QP barrier solves, adjusts related barrier math and control flow, records free-variable indices in presolve metadata, conditions dense-column detection on Q presence/diagonality, times cuDSS initialization, removes one pre-barrier diagnostic, and introduces a ratio-based GMRES stop/update rule.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 3 passed | ❌ 2 failed (1 warning, 1 inconclusive)
Referenced code:

```cpp
// Solve using barrier
lp_solution_t<i_t, f_t> barrier_solution(barrier_lp.num_rows, barrier_lp.num_cols);
```

```cpp
// Clear variable pairs for QP
```
May I ask why we remove the check here?
Actionable comments posted: 1
🧹 Nitpick comments (1)
cpp/src/barrier/iterative_refinement.hpp (1)
181-182: Make the early-stop ratio configurable. Line 181 hard-codes a solver-wide accuracy/performance tradeoff into a shared iterative-refinement path. A fixed `5.0` may be fine for the profiling case that motivated this PR, but it is likely too aggressive for some matrices and too loose for others. Please plumb this through existing solver settings or a caller-supplied parameter instead of baking it into the header.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/barrier/iterative_refinement.hpp` around lines 181 - 182, Replace the hard-coded local constant f_t stop_ratio = 5.0 in iterative_refinement.hpp with a configurable parameter sourced from the solver settings or a caller-provided argument: add a new float-typed setting (or function parameter) named stop_ratio (or iterative_refinement_stop_ratio) to the solver config or to the iterative refinement function/method, use that variable in place of the literal when computing bnorm and the early-stop decision, keep the default value at 5.0 if not supplied, and update any callers or constructor that invoke the iterative refinement path to thread the new setting through.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/barrier/iterative_refinement.hpp`:
- Around line 365-380: The branch treating improvement_ratio == stop_ratio as
non-improvement is wrong; update the comparison logic in the iterative
refinement block (symbols: improvement_ratio, stop_ratio, best_residual, x_sav,
x, show_info, CUOPT_LOG_INFO) so that hitting the threshold counts as
improvement (e.g., use >= instead of >) or use an epsilon-based comparison for
floating-point stability (|improvement_ratio - stop_ratio| <= eps) to decide
improvement, and ensure the residual update and raft::copy for
best_residual/x_sav->x remain inside the "improved" branch while keeping the
logging/break behavior only for true stagnation/worsening cases.
---
Nitpick comments:
In `@cpp/src/barrier/iterative_refinement.hpp`:
- Around line 181-182: Replace the hard-coded local constant f_t stop_ratio =
5.0 in iterative_refinement.hpp with a configurable parameter sourced from the
solver settings or a caller-provided argument: add a new float-typed setting (or
function parameter) named stop_ratio (or iterative_refinement_stop_ratio) to the
solver config or to the iterative refinement function/method, use that variable
in place of the literal when computing bnorm and the early-stop decision, keep
the default value at 5.0 if not supplied, and update any callers or constructor
that invoke the iterative refinement path to thread the new setting through.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: a7b6ab20-ce6f-49c8-9326-46dd180e9846
📒 Files selected for processing (1)
cpp/src/barrier/iterative_refinement.hpp
```cpp
f_t improvement_ratio = best_residual / residual;
// Track best solution
if (residual < best_residual) {
  if (improvement_ratio > stop_ratio) {
    best_residual = residual;
    raft::copy(x_sav.data(), x.data(), x.size(), x.stream());
  } else if (improvement_ratio < stop_ratio && improvement_ratio > 1.0) {
    best_residual = residual;
    raft::copy(x_sav.data(), x.data(), x.size(), x.stream());
    // Residual decreased, but not enough, continue
    if (show_info) {
      CUOPT_LOG_INFO("GMRES IR: improvement ratio %e is less than %e, breaking early",
                     improvement_ratio,
                     stop_ratio);
    }
    break;
  } else {
```
Treat improvement_ratio == stop_ratio as improvement, not stagnation.
With the current > / < split, an exact threshold hit falls into the else branch, restores x_sav, and logs it as "residual increased or stagnated" even though the residual improved. That can discard a better iterate at the boundary. Use >= here at minimum, or an epsilon-based comparison if you want this threshold to be numerically stable.
🔧 Minimal fix

```diff
-  if (improvement_ratio > stop_ratio) {
+  if (improvement_ratio >= stop_ratio) {
     best_residual = residual;
     raft::copy(x_sav.data(), x.data(), x.size(), x.stream());
-  } else if (improvement_ratio < stop_ratio && improvement_ratio > 1.0) {
+  } else if (improvement_ratio > 1.0) {
     best_residual = residual;
     raft::copy(x_sav.data(), x.data(), x.size(), x.stream());
```

As per coding guidelines, check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```cpp
f_t improvement_ratio = best_residual / residual;
// Track best solution
if (improvement_ratio >= stop_ratio) {
  best_residual = residual;
  raft::copy(x_sav.data(), x.data(), x.size(), x.stream());
} else if (improvement_ratio > 1.0) {
  best_residual = residual;
  raft::copy(x_sav.data(), x.data(), x.size(), x.stream());
  // Residual decreased, but not enough, continue
  if (show_info) {
    CUOPT_LOG_INFO("GMRES IR: improvement ratio %e is less than %e, breaking early",
                   improvement_ratio,
                   stop_ratio);
  }
  break;
} else {
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/barrier/iterative_refinement.hpp` around lines 365 - 380, The branch
treating improvement_ratio == stop_ratio as non-improvement is wrong; update the
comparison logic in the iterative refinement block (symbols: improvement_ratio,
stop_ratio, best_residual, x_sav, x, show_info, CUOPT_LOG_INFO) so that hitting
the threshold counts as improvement (e.g., use >= instead of >) or use an
epsilon-based comparison for floating-point stability (|improvement_ratio -
stop_ratio| <= eps) to decide improvement, and ensure the residual update and
raft::copy for best_residual/x_sav->x remain inside the "improved" branch while
keeping the logging/break behavior only for true stagnation/worsening cases.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/barrier/barrier.cu`:
- Around line 2310-2340: The free-variable branch currently sets diag to 0
whenever q_jj > 0 which allows tiny positive Q(j,j) to produce near-zero diag;
change the lambda used in the cub::DeviceTransform::Transform (the one capturing
free_var_reg in the data.Q.n > 0 && data.Q_diagonal case) to floor the returned
value at free_var_reg (e.g., return max(q_jj, free_var_reg) or return (q_jj >
free_var_reg) ? q_jj : free_var_reg) so diag for free variables is never below
free_var_reg; also ensure the other free-variable-only lambda (the data.Q.n == 0
path) similarly returns at least free_var_reg for is_free entries.
- Around line 3596-3606: The initial mu computation (the one that divides by n +
num_upper_bounds before the iteration loop) doesn't account for native free
variables set via presolve_info.free_variable_indices and data.n_free_vars;
update that initial mu calculation to use the same adjusted divisor as
compute_mu()/compute_target_mu() by subtracting data.n_free_vars (i.e., use n +
num_upper_bounds - data.n_free_vars) when presolve_info.free_variable_indices is
non-empty, and ensure data.n_free_vars and d_is_free_ are initialized before
this initial mu is computed so the first iteration uses the same barrier measure
as later compute_mu()/compute_target_mu() calls.
- Around line 3081-3083: The calculation of mu_aff and subsequent mu division
must guard against mu_denom == 0 (mu_denom = x.size() + n_upper_bounds -
n_free_vars) to avoid division-by-zero when all variables are free; add an
explicit fast-path: compute mu_denom as now, then if mu_denom <= eps (use a
small f_t epsilon like 0 or machine epsilon), set mu_aff = static_cast<f_t>(0)
and mu = static_cast<f_t>(0), set any mu ratio (e.g., mu_aff_over_mu) to a safe
default (1 or 0 depending on downstream expectations) and skip the
mu-aff/mu-based updates, otherwise perform the existing divisions. Apply the
identical guard and handling for the other occurrence around the mu computation
block (the second spot noted in the comment).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 94fd7d1f-a999-4dab-a9dc-d5d097ed0af6
📒 Files selected for processing (3)
cpp/src/barrier/barrier.cu
cpp/src/dual_simplex/presolve.cpp
cpp/src/dual_simplex/presolve.hpp
```cpp
f_t mu_denom = static_cast<f_t>(data.x.size()) + static_cast<f_t>(data.n_upper_bounds);
mu_denom -= static_cast<f_t>(data.n_free_vars);
mu_aff = complementarity_aff_sum / mu_denom;
```
Handle the zero-complementarity case before computing mu.
With native free-variable QPs, x.size() + n_upper_bounds - n_free_vars can be zero when every variable is free and there are no finite upper bounds. In that case both mu_aff and mu divide by zero, and the subsequent mu_aff / mu ratio is undefined as well. This needs a dedicated fast path rather than just subtracting n_free_vars.
💡 Suggested guard

```diff
-  f_t mu_denom = static_cast<f_t>(data.x.size()) + static_cast<f_t>(data.n_upper_bounds);
-  mu_denom -= static_cast<f_t>(data.n_free_vars);
-  mu_aff = complementarity_aff_sum / mu_denom;
+  f_t mu_denom = static_cast<f_t>(data.x.size()) + static_cast<f_t>(data.n_upper_bounds) -
+                 static_cast<f_t>(data.n_free_vars);
+  if (mu_denom == f_t(0)) {
+    mu_aff = f_t(0);
+    sigma  = f_t(0);
+    new_mu = f_t(0);
+    return;
+  }
+  mu_aff = complementarity_aff_sum / mu_denom;
```

```diff
-  f_t mu_denom = static_cast<f_t>(data.x.size()) + static_cast<f_t>(data.n_upper_bounds);
-  mu_denom -= static_cast<f_t>(data.n_free_vars);  // free vars don't contribute to mu
+  f_t mu_denom = static_cast<f_t>(data.x.size()) + static_cast<f_t>(data.n_upper_bounds) -
+                 static_cast<f_t>(data.n_free_vars);
+  if (mu_denom == f_t(0)) {
+    mu = f_t(0);
+    return;
+  }
```

As per coding guidelines, "Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks."
Also applies to: 3307-3316
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/barrier/barrier.cu` around lines 3081 - 3083, The calculation of
mu_aff and subsequent mu division must guard against mu_denom == 0 (mu_denom =
x.size() + n_upper_bounds - n_free_vars) to avoid division-by-zero when all
variables are free; add an explicit fast-path: compute mu_denom as now, then if
mu_denom <= eps (use a small f_t epsilon like 0 or machine epsilon), set mu_aff
= static_cast<f_t>(0) and mu = static_cast<f_t>(0), set any mu ratio (e.g.,
mu_aff_over_mu) to a safe default (1 or 0 depending on downstream expectations)
and skip the mu-aff/mu-based updates, otherwise perform the existing divisions.
Apply the identical guard and handling for the other occurrence around the mu
computation block (the second spot noted in the comment).