Reduce memory footprint of cuPDLPx#1153

Open
Bubullzz wants to merge 31 commits into NVIDIA:main from Bubullzz:cupdlpx_memory_reduce

Conversation

@Bubullzz

This PR removes most of the unused or one-time-used device_vectors in cuPDLPx to reduce the memory footprint. It allows running bigger problems without running out of memory.

@Bubullzz Bubullzz requested a review from a team as a code owner April 28, 2026 09:34
@Bubullzz Bubullzz requested review from aliceb-nv and mlubin April 28, 2026 09:34
@copy-pr-bot

copy-pr-bot Bot commented Apr 28, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Bubullzz Bubullzz added the improvement Improves an existing functionality label Apr 28, 2026
@Bubullzz
Author

/ok test 5e9d12b

@coderabbitai

coderabbitai Bot commented Apr 28, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds cuPDLPx restart-awareness: redirects cuSPARSE CSR descriptors to original problem buffers, propagates an is_cupdlpx flag through solver and saddle-point initialization, and conditionally allocates/frees several device buffers and averaged-state structures for restart scenarios.

Changes

Cohort / File(s) Summary
CuSPARSE descriptor redirection
cpp/src/pdlp/cusparse_view.hpp, cpp/src/pdlp/cusparse_view.cu
Added redirect_rows_and_cols(const problem_t<i_t,f_t>&) to rebind CSR offsets/indices/value pointers from scaled problem vectors to the original problem; for f_t=double, optionally redirect mixed-precision value pointers to existing FP32 buffers when mixed precision is enabled. Also adjusted restart/initialization sizing to use the saddle-point state's actual primal_gradient size.
Saddle-point state
cpp/src/pdlp/saddle_point.hpp, cpp/src/pdlp/saddle_point.cu
Extended constructor with an is_cupdlpx parameter (default false); when true, initialize primal_gradient_ (and related buffers) with size 0 to avoid allocating unnecessary buffers during cuPDLPx restart.
PDLP core & solver wiring
cpp/src/pdlp/pdlp.cuh, cpp/src/pdlp/pdlp.cu
Added is_cupdlpx_ member; compute restart flag at construction and avoid allocating unscaled average-solution buffers when cuPDLPx restart is active without warm-start data. At run time, call descriptor redirection and resize duplicated structural vectors to zero; add a lighten_problem_for_pdlp cleanup pass.
Restart & termination logic
cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu, cpp/src/pdlp/termination_strategy/infeasibility_information.cu
Conditionally set averaging solution dimensions to zero when hyper_params.never_restart_to_average/restart logic applies; allocate slack buffers only when both cuPDLPx restart is active and infeasibility detection is enabled.
PDHG & constructor propagation
cpp/src/pdlp/pdhg.cu
Pass is_cupdlpx_restart<i_t,f_t>(hyper_params) into current_saddle_point_state_ so saddle-point state construction is restart-aware.
Resource management / scaling vectors
cpp/src/pdlp/initial_scaling_strategy/initial_scaling.cu
Immediately free per-iteration GPU scaling vectors by resizing iteration_constraint_matrix_scaling_ and iteration_variable_scaling_ to zero after computing scaling vectors.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 7.69%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Title check (✅ Passed): The title clearly and concisely describes the primary objective of the changeset: reducing memory consumption in cuPDLPx by eliminating unused device vectors.
  • Description check (✅ Passed): The description is directly related to the changeset, explaining that the PR removes unused/one-time-used device_vectors to reduce memory footprint and enable larger problems.
  • Linked Issues check (✅ Passed): Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check (✅ Passed): Check skipped because no linked issues were found for this pull request.





@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/pdlp/saddle_point.cu (1)

20-53: ⚠️ Potential issue | 🔴 Critical

Add guard to prevent creating unsafe view of zero-sized primal gradient buffer in cuPDLPx mode.

In cpp/src/pdlp/cusparse_view.cu, the constructor unconditionally creates a view with op_problem_scaled.n_variables elements from the primal gradient buffer (line 410–411). However, when is_cupdlpx is true, the saddle point state allocates primal_gradient_ with size 0. This creates an unsafe view that could access out-of-bounds memory.

The cudaMemsetAsync call on the zero-sized buffer itself is safe (no-op), but the view creation is the actual hazard. Guard line 410–411 to skip view creation when primal_gradient_ is empty, or add an assertion confirming that the buffer is non-empty before creating the view.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/saddle_point.cu` around lines 20 - 53, The constructor
saddle_point_state_t currently creates primal_gradient_ with size 0 when
is_cupdlpx is true, but cusparse_view.cu later unconditionally builds a view of
length op_problem_scaled.n_variables from that buffer; add a guard around the
view creation in cusparse_view.cu (or assert buffer non-empty) so you only
create the view when saddle_point_state_t::primal_gradient_.size() > 0 (i.e.,
when is_cupdlpx is false), preventing an unsafe zero-sized-buffer view;
reference the symbols primal_gradient_, saddle_point_state_t, and the view
creation that uses op_problem_scaled.n_variables to locate the change.
🧹 Nitpick comments (1)
cpp/src/mip_heuristics/problem/problem.cuh (1)

154-154: Keep pdlp_lighten internal if possible.

This reads like solver-internal cleanup rather than a stable problem_t capability. Exposing it publicly makes it easier for callers to put the object into a partially-invalid state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/problem/problem.cuh` at line 154, The pdlp_lighten
declaration in the public header exposes an internal solver cleanup operation
(pdlp_lighten) on problem_t; remove this from the .cuh public header and instead
make it internal by moving its declaration/definition into the corresponding
implementation file (or place it in an anonymous namespace / mark static) so
only internal code operating on problem_t can call it; update any internal
callers to include the private implementation file and keep the symbol out of
the public API.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/pdlp/pdlp.cu`:
- Around line 2268-2278: The verbose printing accesses
problem_ptr->combined_bounds after problem_ptr->pdlp_lighten() frees it, causing
a use-after-free; move the PDLP_VERBOSE_MODE verbose block (the call to
print_problem_info<f_t> that uses problem_ptr->combined_bounds) to before
calling op_problem_scaled_.pdlp_lighten() and problem_ptr->pdlp_lighten(), or
alternatively postpone calling problem_ptr->pdlp_lighten() until after the
verbose print so that combined_bounds remains valid; update the code around the
calls to pdhg_solver_.get_cusparse_view().redirect_rows_and_cols(*problem_ptr),
op_problem_scaled_.pdlp_lighten(), and problem_ptr->pdlp_lighten() to ensure
print_problem_info<f_t> runs while combined_bounds is still allocated.
- Around line 124-127: Add fail-fast invariants to the warm-start validation so
copying into zero-sized average buffers is prevented: in the warm-start
validation block (after the existing validation around initial_*), add
cuopt_expects checks that when unscaled_primal_avg_solution_.size() == 0 then
initial_primal_average_.empty() (and similarly when
unscaled_dual_avg_solution_.size() == 0 then initial_dual_average_.empty()), and
fail otherwise; reference the buffers unscaled_primal_avg_solution_ and
unscaled_dual_avg_solution_ and the inputs initial_primal_average_ and
initial_dual_average_ so the logic short-circuits before any memcpy/copy occurs.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6de370d2-d055-40a2-91b7-1c54fcbb34cc

📥 Commits

Reviewing files that changed from the base of the PR and between fd40c9a and 5e9d12b.

📒 Files selected for processing (12)
  • cpp/src/mip_heuristics/problem/problem.cu
  • cpp/src/mip_heuristics/problem/problem.cuh
  • cpp/src/pdlp/cusparse_view.cu
  • cpp/src/pdlp/cusparse_view.hpp
  • cpp/src/pdlp/initial_scaling_strategy/initial_scaling.cu
  • cpp/src/pdlp/pdhg.cu
  • cpp/src/pdlp/pdlp.cu
  • cpp/src/pdlp/pdlp.cuh
  • cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu
  • cpp/src/pdlp/saddle_point.cu
  • cpp/src/pdlp/saddle_point.hpp
  • cpp/src/pdlp/termination_strategy/infeasibility_information.cu

Comment thread cpp/src/pdlp/pdlp.cu Outdated
Comment thread cpp/src/pdlp/pdlp.cu Outdated
@Bubullzz Bubullzz added the non-breaking Introduces a non-breaking change label Apr 28, 2026

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (3)
cpp/src/pdlp/pdlp.cu (3)

210-211: Inconsistent sizing logic between buffers and termination strategy.

Lines 132-139 allocate full-size average buffers when warm-start is populated in cuPDLPx mode, but the termination strategy here unconditionally uses size 0 for cuPDLPx. This mismatch relies on an implicit invariant that cuPDLPx mode always has never_restart_to_average=true.

Consider adding explicit validation in the constructor to enforce this invariant and prevent subtle bugs:

if (is_cupdlpx_) {
  cuopt_expects(settings_.hyper_params.never_restart_to_average,
                error_type_t::ValidationError,
                "cuPDLPx mode requires never_restart_to_average=true");
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/pdlp.cu` around lines 210 - 211, The allocation vs termination
mismatch for cuPDLPx is unsafe: ensure the constructor validates that when
is_cupdlpx_ is true the invariant never_restart_to_average is enforced; in the
class/constructor that initializes is_cupdlpx_ add a guard checking
settings_.hyper_params.never_restart_to_average and raise a ValidationError (use
cuopt_expects or existing validation error pattern) with a clear message like
"cuPDLPx mode requires never_restart_to_average=true" so buffers allocated for
warm-start match the termination logic using the zero sizes.

2473-2476: Redundant cleanup calls—already executed unconditionally earlier.

Lines 2289-2290 already call lighten_problem_for_pdlp on both op_problem_scaled_ and *problem_ptr unconditionally. This conditional block performs the same resize-to-0 operations, which are no-ops since the vectors are already empty.

♻️ Suggested removal
-  if (!settings_.per_constraint_residual) {
-    lighten_problem_for_pdlp(op_problem_scaled_, stream_view_);
-    lighten_problem_for_pdlp(*problem_ptr, stream_view_);
-  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/pdlp.cu` around lines 2473 - 2476, The conditional block
redundantly calls lighten_problem_for_pdlp on op_problem_scaled_ and
*problem_ptr when settings_.per_constraint_residual is false, but these same
calls were already made unconditionally earlier; remove the conditional block
containing lighten_problem_for_pdlp(op_problem_scaled_, stream_view_) and
lighten_problem_for_pdlp(*problem_ptr, stream_view_) to avoid duplicate no-op
cleanup and keep only the earlier unconditional calls.

2280-2290: Memory optimization looks correct, but modifies the original problem.

The redirect-before-free sequence is correct. However, lighten_problem_for_pdlp(*problem_ptr, ...) at line 2290 modifies the caller's problem object by freeing variable_types, related_variables_offsets, and integer_fixed_variable_map.

If callers expect the original problem to remain unmodified after solving (e.g., for re-solving or inspection), this could be surprising. Consider documenting this behavior or making it opt-in.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/pdlp.cu` around lines 2280 - 2290, The current call lightening
the device vectors (lighten_problem_for_pdlp) is mutating the caller's Problem
via *problem_ptr after redirect_rows_and_cols; to avoid surprising callers,
change to operate on a local copy or make the destructive behavior opt-in:
either (A) replace lighten_problem_for_pdlp(*problem_ptr, stream_view_) with a
call on a shallow/deep copy (e.g., create Problem temp = *problem_ptr;
lighten_problem_for_pdlp(temp, stream_view_);) so the original stays intact, or
(B) add a flag parameter to lighten_problem_for_pdlp (e.g., bool
preserve_original) and invoke it with preserve_original=true so it copies
internally before freeing
variable_types/related_variables_offsets/integer_fixed_variable_map; update call
sites (pdhg_solver_.get_cusparse_view().redirect_rows_and_cols,
op_problem_scaled_ use remains) and document the new behavior in the function
comment.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 67ac28f7-ae73-4132-b0ff-2fa984c912c9

📥 Commits

Reviewing files that changed from the base of the PR and between 7c2e414 and 84daee4.

📒 Files selected for processing (2)
  • cpp/src/pdlp/cusparse_view.cu
  • cpp/src/pdlp/pdlp.cu
🚧 Files skipped from review as they are similar to previous changes (1)
  • cpp/src/pdlp/cusparse_view.cu

Contributor

@mlubin mlubin left a comment


Nice! Do you have any concrete numbers in the reduction of peak memory usage?

Comment thread cpp/src/pdlp/pdlp.cuh Outdated
Comment thread cpp/src/pdlp/saddle_point.hpp Outdated
Comment thread cpp/src/pdlp/pdlp.cu
Comment thread cpp/src/pdlp/pdlp.cu Outdated

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/pdlp/pdlp.cu (1)

205-214: ⚠️ Potential issue | 🔴 Critical

Zero-sized average_termination_strategy_ is still consumed later.

check_termination() only skips evaluate_termination_criteria() when never_restart_to_average is set. It still unconditionally reads get_termination_status(0) at Line 1028, checks all_optimal_status() at Lines 1096 and 1151, and uses get_convergence_information() / fill_return_problem_solution() throughout the average branches below. Constructing the average strategy with {0, 0} therefore removes the storage without removing the uses.

Suggested fix
     average_termination_strategy_{handle_ptr_,
                                   op_problem,
                                   op_problem_scaled_,
                                   average_op_problem_evaluation_cusparse_view_,
                                   pdhg_solver_.get_cusparse_view(),
-                                  settings_.hyper_params.never_restart_to_average ? 0 : primal_size_h_,
-                                  settings_.hyper_params.never_restart_to_average ? 0 : dual_size_h_,
+                                  primal_size_h_,
+                                  dual_size_h_,
                                   initial_scaling_strategy_,
                                   settings_,
                                   climber_strategies_},

As per coding guidelines, "Flag algorithm correctness errors including logic errors in optimization algorithms" and "Flag invalid memory access patterns including out-of-bounds, use-after-free, and host/device confusion."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/pdlp.cu` around lines 205 - 214, The average termination
strategy is being constructed with zero sizes when
settings_.hyper_params.never_restart_to_average is true, but later code
unconditionally accesses it (check_termination(), get_termination_status(0),
all_optimal_status(), get_convergence_information(),
fill_return_problem_solution), causing logic/memory errors; fix by not
constructing average_termination_strategy_ when never_restart_to_average is true
and protect all usages of the average strategy by checking
settings_.hyper_params.never_restart_to_average (or that
primal_size_h_/dual_size_h_ > 0) before calling check_termination(),
get_termination_status(), all_optimal_status(), get_convergence_information(),
or fill_return_problem_solution, or alternatively make those accessors tolerate
an empty/absent average strategy—apply the same conditional guard around every
"average" branch that currently assumes the strategy exists.
🧹 Nitpick comments (1)
cpp/src/pdlp/pdlp.cu (1)

2288-2290: Make the PDLP cleanup condition consistent.

These helper calls already free the PDLP-only vectors unconditionally, so the later if (!settings_.per_constraint_residual) block at Lines 2473-2476 no longer changes behavior. If per_constraint_residual=true is supposed to preserve those buffers, the cleanup is happening too early; otherwise the later branch is just dead code.

Suggested refactor
-  // Remove unused device vectors in PDLP
-  lighten_problem_for_pdlp(op_problem_scaled_, stream_view_);
-  lighten_problem_for_pdlp(*problem_ptr, stream_view_);
+  if (!settings_.per_constraint_residual) {
+    // Remove unused device vectors in PDLP
+    lighten_problem_for_pdlp(op_problem_scaled_, stream_view_);
+    lighten_problem_for_pdlp(*problem_ptr, stream_view_);
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/pdlp.cu` around lines 2288 - 2290, The PDLP-only buffers are
being freed unconditionally by the calls to
lighten_problem_for_pdlp(op_problem_scaled_, stream_view_) and
lighten_problem_for_pdlp(*problem_ptr, stream_view_), making the later branch
guarded by settings_.per_constraint_residual redundant or incorrect; update the
cleanup so it is consistent: either move those two lighten_problem_for_pdlp
calls inside the same if (!settings_.per_constraint_residual) condition so
buffers are preserved when per_constraint_residual==true, or remove the later
dead branch entirely if unconditional freeing is intended—adjust whichever code
path you choose and ensure references to op_problem_scaled_, problem_ptr,
lighten_problem_for_pdlp, and settings_.per_constraint_residual are updated
accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu`:
- Around line 94-101: The containers avg_duality_gap_ and current_duality_gap_
are being constructed with size 0 when hyper_params.never_restart_to_average is
true, but update_distance() and the trust-region restart logic still write/read
them; fix by always allocating these containers with the full primal_size and
dual_size (i.e., remove the conditional zero-sizing) so avg_duality_gap_ and
current_duality_gap_ remain valid, or alternatively guard all writes/reads to
these containers behind the hyper_params.never_restart_to_average check; update
the constructor calls that currently pass "hyper_params.never_restart_to_average
? 0 : primal_size" / "dual_size" so they allocate the real sizes, and ensure
update_distance() and the trust-region restart computation paths use those
allocated buffers safely.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2c589a23-685a-46f9-9498-ad9162723f72

📥 Commits

Reviewing files that changed from the base of the PR and between 84daee4 and edb13ae.

📒 Files selected for processing (2)
  • cpp/src/pdlp/pdlp.cu
  • cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu

Comment thread cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu Outdated
@Bubullzz
Author

/ok test 195ca23

@Bubullzz Bubullzz requested a review from mlubin April 30, 2026 11:36
Comment thread cpp/src/pdlp/cusparse_view.hpp Outdated

// Redirects the vectors of rows and columns from op_problem_scaled_ to the original problem so
// that they can be freed
void redirect_rows_and_cols(const problem_t<i_t, f_t>& original_problem);
Contributor


Why not just move?

This is very specific to cuSPARSE pointers, right? I think the naming should reflect that.

Author

@Bubullzz Bubullzz Apr 30, 2026


We still need original_problem to own these pointers. Here we are only borrowing them from the scaled problem so we can free the duplicated data, which is why I don't think we should use move here.
I will rename it to reflect its link with the cuSPARSE pointers.

Comment thread cpp/src/pdlp/pdlp.cuh Outdated
Comment thread cpp/src/pdlp/saddle_point.hpp Outdated
@Bubullzz Bubullzz requested a review from rg20 April 30, 2026 14:36
@Bubullzz
Author

/ok to test 851be7c

Contributor

@mlubin mlubin left a comment


LGTM, but don't merge without @rg20's approval.

@Bubullzz
Author

Bubullzz commented May 4, 2026

/ok to test 6981c1f

@Bubullzz
Author

Bubullzz commented May 4, 2026

/ok to test 8f9a937

@Bubullzz
Author

Bubullzz commented May 4, 2026

/ok to test 29bd4ca


@Bubullzz
Author

Bubullzz commented May 4, 2026

/ok to test 72afe0a


Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change


3 participants