Skip to content

[FEA] Heuristics use work estimates instead of time limits (#1069)#1122

Open
Pritiks23 wants to merge 2 commits intoNVIDIA:mainfrom
Pritiks23:feature/heuristics-work-estimate
Open

[FEA] Heuristics use work estimates instead of time limits (#1069)#1122
Pritiks23 wants to merge 2 commits intoNVIDIA:mainfrom
Pritiks23:feature/heuristics-work-estimate

Conversation

@Pritiks23
Copy link
Copy Markdown

@Pritiks23 Pritiks23 commented Apr 18, 2026

Description

Refactored all heuristic routines to use work estimates (iteration/move limits) instead of time limits. This ensures deterministic behavior and supports both deterministic and nondeterministic solver modes. All call sites updated; no compilation errors; logic is robust and backward compatible.

Issue

Closes #1069

Checklist

  • [Y ] I am familiar with the Contributing Guidelines.
  • Testing
    • [ Y] New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • [ Y] The documentation is up to date with these changes
    • Added new documentation
    • NA

@Pritiks23 Pritiks23 requested a review from a team as a code owner April 18, 2026 03:12
@Pritiks23 Pritiks23 requested review from mlubin and nguidotti April 18, 2026 03:12
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Apr 18, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 18, 2026

📝 Walkthrough

Walkthrough

Refactors local-search control from time-limit-driven to work-estimate-driven. Function signatures for run_best_local_search and run_random_local_search changed to accept work-estimate parameters; implementations now track and check a work counter and callers pass explicit work limits and mode flags.

Changes

Cohort / File(s) Summary
Local Search Core
cpp/src/routing/local_search/local_search.cuh, cpp/src/routing/local_search/local_search.cu
Changed signatures: run_best_local_search(...) now takes (use_work_estimate, work_estimate_limit); run_random_local_search(...) now takes (use_work_estimate, work_estimate_limit) with new defaults. Added work-tracking state (work_counter, work_limit, use_work_estimate) and helpers (reset_work_counter(), check_work_estimate(), increment_work_counter(...)). Replaced time-limit checks with work-estimate checks and added work-counter accounting throughout loops.
Adapter Layer
cpp/src/routing/adapters/adapted_generator.cu, cpp/src/routing/adapters/adapted_modifier.cu
Updated local-search invocations to pass work-estimate mode and explicit limits (e.g., true, 1000 or computed work_estimate_limit). Removed start_timer() usage and time-limit gating; converted logic to compute/pass work-estimate limits (fallbacks like 10000).
GES Module
cpp/src/routing/ges/execute_insertion.cu, cpp/src/routing/ges/squeeze.cu
Replaced calls to run_random_local_search(..., false) with run_random_local_search(..., true, 1000) and switched run_best_local_search calls from time-limit boolean to (use_work_estimate, work_estimate_limit, run_cycle_finder). Removed start_timer(remaining_time()) and adapted surrounding control flow to use work-estimate parameters.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: refactoring heuristics to use work estimates instead of time limits, which is the core objective of this pull request.
Description check ✅ Passed The description is related to the changeset, explaining the refactoring of heuristic routines to use work estimates instead of time limits and mentions updated call sites.
Linked Issues check ✅ Passed The pull request addresses the primary objective of issue #1069: replacing wall-clock time limits in heuristics with work estimates to ensure deterministic behavior and support both solver modes.
Out of Scope Changes check ✅ Passed All changes are focused on replacing time-limit-based termination with work-estimate-based termination across heuristic routines, directly addressing the linked issue scope.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cpp/src/routing/adapters/adapted_modifier.cu (1)

45-59: ⚠️ Potential issue | 🟠 Major

improve() no longer honors its time_limit input.

This now always runs run_best_local_search(..., true, 10000, ...), so callers get the same search budget regardless of the time_limit they pass. That is an observable behavior change for any path using improve() to bound latency. Please either translate time_limit into a work budget here or remove/rename the parameter in the same change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/routing/adapters/adapted_modifier.cu` around lines 45 - 59, The
improve() method in adapted_modifier_t (function improve(...)) ignores its
time_limit parameter and hardcodes the work budget by calling
resource.ls.run_best_local_search(..., true, 10000, ...); update this to
translate time_limit into an appropriate work budget (or use it to select a work
estimate) before calling run_best_local_search, e.g., compute a work_limit from
time_limit and pass it instead of 10000, or if you intend to remove timing
control, remove/rename the time_limit parameter and update callers; ensure the
change touches adapted_modifier_t::improve and the run_best_local_search call so
callers' latency bounds are respected.
cpp/src/routing/local_search/local_search.cu (1)

338-367: ⚠️ Potential issue | 🟠 Major

Add work counter increments to run_random_local_search().

The work estimate check at line 364 will never trigger because work_counter is reset at line 344 and never incremented anywhere in run_random_local_search() or its callees (find_insertions(), populate_random_moves(), perform_moves()). This makes the use_work_estimate_ and work_estimate_limit_ parameters inert on this code path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/routing/local_search/local_search.cu` around lines 338 - 367, The
work_counter is reset by reset_work_counter() at start of
run_random_local_search() but never incremented, so use_work_estimate_ is
effectively ignored; add increments to the shared work counter (the same counter
checked by check_work_estimate()) at sensible points in
run_random_local_search() — e.g., after heavy operations like
calculate_route_compatibility(sol), after find_insertions<i_t,f_t,REQUEST>(...),
after populate_random_moves(sol), and inside the loop/logic that applies moves
(perform_moves or wherever move application is done) so that work_counter
advances toward work_limit_ and the work_estimate_reached boolean can become
true; reference the symbols reset_work_counter, work_counter,
check_work_estimate, work_limit, use_work_estimate,
calculate_route_compatibility, find_insertions, populate_random_moves, and
perform_moves when adding these increments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/routing/ges/squeeze.cu`:
- Around line 246-253: The fixed work_estimate_limit (work_estimate_limit =
10000) passed to local_search_ptr_->run_best_local_search makes squeeze
local-search calls ignore the remaining solver budget; replace the hardcoded
constant by deriving a budget from the solver's remaining budget (or add a
remaining-time escape hatch) and pass that computed work estimate into
run_best_local_search, ensuring solution_ptr->global_runtime_checks is still
invoked to enable deterministic behavior; apply the same change to the
equivalent call sites around the other noted regions (the other squeeze calls at
the later blocks) so all squeeze-time local-search calls respect the remaining
solver budget.

---

Outside diff comments:
In `@cpp/src/routing/adapters/adapted_modifier.cu`:
- Around line 45-59: The improve() method in adapted_modifier_t (function
improve(...)) ignores its time_limit parameter and hardcodes the work budget by
calling resource.ls.run_best_local_search(..., true, 10000, ...); update this to
translate time_limit into an appropriate work budget (or use it to select a work
estimate) before calling run_best_local_search, e.g., compute a work_limit from
time_limit and pass it instead of 10000, or if you intend to remove timing
control, remove/rename the time_limit parameter and update callers; ensure the
change touches adapted_modifier_t::improve and the run_best_local_search call so
callers' latency bounds are respected.

In `@cpp/src/routing/local_search/local_search.cu`:
- Around line 338-367: The work_counter is reset by reset_work_counter() at
start of run_random_local_search() but never incremented, so use_work_estimate_
is effectively ignored; add increments to the shared work counter (the same
counter checked by check_work_estimate()) at sensible points in
run_random_local_search() — e.g., after heavy operations like
calculate_route_compatibility(sol), after find_insertions<i_t,f_t,REQUEST>(...),
after populate_random_moves(sol), and inside the loop/logic that applies moves
(perform_moves or wherever move application is done) so that work_counter
advances toward work_limit_ and the work_estimate_reached boolean can become
true; reference the symbols reset_work_counter, work_counter,
check_work_estimate, work_limit, use_work_estimate,
calculate_route_compatibility, find_insertions, populate_random_moves, and
perform_moves when adding these increments.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 62ba6f4a-b543-4310-9773-ce48b86bd6d1

📥 Commits

Reviewing files that changed from the base of the PR and between 6b0d8a4 and 7e6004b.

📒 Files selected for processing (6)
  • cpp/src/routing/adapters/adapted_generator.cu
  • cpp/src/routing/adapters/adapted_modifier.cu
  • cpp/src/routing/ges/execute_insertion.cu
  • cpp/src/routing/ges/squeeze.cu
  • cpp/src/routing/local_search/local_search.cu
  • cpp/src/routing/local_search/local_search.cuh

Comment on lines +246 to +253
// Use work estimate instead of time limit for determinism
solution_ptr->global_runtime_checks(true, false, "squeeze_all_and_save_before_ls");
const bool consider_unserviced = false;
const bool enable_time_limit = true;
const bool use_work_estimate = true;
const int work_estimate_limit = 10000; // Example value, can be parameterized
const bool enable_cycle_finder = false;
local_search_ptr_->run_best_local_search(
*solution_ptr, consider_unserviced, enable_time_limit, enable_cycle_finder);
*solution_ptr, consider_unserviced, use_work_estimate, work_estimate_limit, enable_cycle_finder);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

These squeeze-time local-search calls are no longer tied to the remaining solve budget.

All three paths now use a fixed work_estimate_limit = 10000. That makes the wall-clock cost of a squeeze step depend on instance size/hardware instead of the remaining GES budget, so one nested local-search call can overrun the caller's latency target. Please derive this budget from the remaining solver budget, or keep a remaining-time escape hatch around these calls.

Also applies to: 340-346, 401-409

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/routing/ges/squeeze.cu` around lines 246 - 253, The fixed
work_estimate_limit (work_estimate_limit = 10000) passed to
local_search_ptr_->run_best_local_search makes squeeze local-search calls ignore
the remaining solver budget; replace the hardcoded constant by deriving a budget
from the solver's remaining budget (or add a remaining-time escape hatch) and
pass that computed work estimate into run_best_local_search, ensuring
solution_ptr->global_runtime_checks is still invoked to enable deterministic
behavior; apply the same change to the equivalent call sites around the other
noted regions (the other squeeze calls at the later blocks) so all squeeze-time
local-search calls respect the remaining solver budget.

@anandhkb
Copy link
Copy Markdown
Contributor

Thanks @Pritiks23 for your PR! Please review the AI codereview bot's comments and kindly address any actionable ones that make sense.

@rg20
Copy link
Copy Markdown
Contributor

rg20 commented Apr 20, 2026

@Pritiks23 Thanks for contributing to cuOpt.

Can you please add the description of changes and in what way these changes are improving the problems you are solving?

@rg20 rg20 requested review from rg20 and removed request for nguidotti April 20, 2026 14:49
@Pritiks23
Copy link
Copy Markdown
Author

Thanks, I added the right description + addressed the AI codeview bot's comments.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
cpp/src/routing/adapters/adapted_modifier.cu (1)

57-61: Work estimate derivation looks reasonable.

The approach of converting time_limit to a work estimate via multiplication is acceptable for this refactor. The TODO indicates the factor needs tuning.

Minor edge case: if time_limit is extremely large (> ~2.1M seconds), the static_cast<int> could overflow. Consider using a saturating cast or clamping to std::numeric_limits<int>::max() before the conversion if this function could receive unbounded input.

🛡️ Optional: saturating conversion
-  int work_estimate_limit = static_cast<int>(time_limit * 1000); // TODO: tune factor
-  if (work_estimate_limit <= 0) work_estimate_limit = 10000;
+  // Saturate to prevent overflow for very large time_limit values
+  constexpr int kDefaultWorkLimit = 10000;
+  constexpr double kTimeFactor = 1000.0; // TODO: tune factor
+  double raw_limit = time_limit * kTimeFactor;
+  int work_estimate_limit = (raw_limit > static_cast<double>(std::numeric_limits<int>::max()))
+                              ? std::numeric_limits<int>::max()
+                              : static_cast<int>(raw_limit);
+  if (work_estimate_limit <= 0) work_estimate_limit = kDefaultWorkLimit;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/routing/adapters/adapted_modifier.cu` around lines 57 - 61, The
conversion of time_limit to work_estimate_limit using
static_cast<int>(time_limit * 1000) can overflow for extremely large time_limit;
update the logic in the block computing work_estimate_limit (where
work_estimate_limit is set and passed to resource.ls.run_best_local_search) to
perform a saturating/clamped conversion: compute the product as a wider type (or
check time_limit against std::numeric_limits<int>::max()/1000), and if it would
exceed int max, set work_estimate_limit = std::numeric_limits<int>::max(); keep
the existing minimum fallback (10000) intact.
cpp/src/routing/ges/squeeze.cu (1)

404-413: LGTM with optional refactor suggestion.

The implementation is correct and consistent with the other call sites.

Consider extracting the work-estimate calculation pattern.

This conversion logic (multiplying by 1000 with fallback to 10000) is now repeated 4 times across adapted_modifier.cu and squeeze.cu. Extracting it to a small helper function would centralize the tuning factor and default value, making future adjustments easier.

♻️ Optional: helper function to reduce duplication

Add to an appropriate header (e.g., local_search.cuh or a new utility header):

inline int compute_work_estimate_from_time(double time_value, 
                                           double factor = 1000.0, 
                                           int default_limit = 10000) {
  int limit = static_cast<int>(time_value * factor);
  return (limit > 0) ? limit : default_limit;
}

Then each call site simplifies to:

int work_estimate_limit = compute_work_estimate_from_time(remaining_time());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/routing/ges/squeeze.cu` around lines 404 - 413, The work-estimate
computation (multiplying remaining_time() by 1000 with a fallback to 10000) is
duplicated; extract it to a small helper (e.g., compute_work_estimate_from_time)
and replace the inline logic in squeeze.cu where
local_search_ptr_->run_best_local_search is called (the current int
work_estimate_limit = static_cast<int>(remaining_time() * 1000); if
(work_estimate_limit <= 0) work_estimate_limit = 10000;). Implement the helper
in a shared header used by adapted_modifier.cu and squeeze.cu (choose
local_search.cuh or a new util header), give it default factor and default_limit
parameters, and call it from the call site (use remaining_time() as the input)
so all four duplicated sites use the central function.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cpp/src/routing/adapters/adapted_modifier.cu`:
- Around line 57-61: The conversion of time_limit to work_estimate_limit using
static_cast<int>(time_limit * 1000) can overflow for extremely large time_limit;
update the logic in the block computing work_estimate_limit (where
work_estimate_limit is set and passed to resource.ls.run_best_local_search) to
perform a saturating/clamped conversion: compute the product as a wider type (or
check time_limit against std::numeric_limits<int>::max()/1000), and if it would
exceed int max, set work_estimate_limit = std::numeric_limits<int>::max(); keep
the existing minimum fallback (10000) intact.

In `@cpp/src/routing/ges/squeeze.cu`:
- Around line 404-413: The work-estimate computation (multiplying
remaining_time() by 1000 with a fallback to 10000) is duplicated; extract it to
a small helper (e.g., compute_work_estimate_from_time) and replace the inline
logic in squeeze.cu where local_search_ptr_->run_best_local_search is called
(the current int work_estimate_limit = static_cast<int>(remaining_time() *
1000); if (work_estimate_limit <= 0) work_estimate_limit = 10000;). Implement
the helper in a shared header used by adapted_modifier.cu and squeeze.cu (choose
local_search.cuh or a new util header), give it default factor and default_limit
parameters, and call it from the call site (use remaining_time() as the input)
so all four duplicated sites use the central function.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 9fb7e37a-b384-4727-9f7e-e60522d3ab70

📥 Commits

Reviewing files that changed from the base of the PR and between 7e6004b and 0bed41b.

📒 Files selected for processing (3)
  • cpp/src/routing/adapters/adapted_modifier.cu
  • cpp/src/routing/ges/squeeze.cu
  • cpp/src/routing/local_search/local_search.cu
🚧 Files skipped from review as they are similar to previous changes (1)
  • cpp/src/routing/local_search/local_search.cu

@chris-maes
Copy link
Copy Markdown
Contributor

@Pritiks23 thanks for the PR. I think there may have been some confusion #1069 was about the MIP solver, not the VRP solver.

@mlubin mlubin removed their request for review April 27, 2026 14:03
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 5, 2026

🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEA] Heuristics should use work estimates instead of time limits

4 participants