[CALCITE-7422] Support large plan optimization mode for HepPlanner #4803
zhuwenzhuang wants to merge 5 commits into apache:main
Conversation
A better iterator implementation of DFS/BFS is needed. I will optimize this later (unavailable for the next two weeks).
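One common way to get a DFS iterator without recursion is an explicit stack. The sketch below is not the Calcite implementation, only a minimal illustration of the idea; the class name, the integer vertex type, and the adjacency-map representation are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

/** Minimal non-recursive DFS iterator over an adjacency list (illustrative only). */
public class DfsIterator implements Iterator<Integer> {
  private final Map<Integer, List<Integer>> adj;
  private final Deque<Integer> stack = new ArrayDeque<>();
  private final Set<Integer> visited = new HashSet<>();

  public DfsIterator(Map<Integer, List<Integer>> adj, int root) {
    this.adj = adj;
    stack.push(root);
  }

  @Override public boolean hasNext() {
    // Discard stack entries that were already visited via another path.
    while (!stack.isEmpty() && visited.contains(stack.peek())) {
      stack.pop();
    }
    return !stack.isEmpty();
  }

  @Override public Integer next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    int v = stack.pop();
    visited.add(v);
    List<Integer> children = adj.getOrDefault(v, List.of());
    // Push children in reverse so the first child is visited first.
    for (int i = children.size() - 1; i >= 0; i--) {
      if (!visited.contains(children.get(i))) {
        stack.push(children.get(i));
      }
    }
    return v;
  }
}
```

Because the iterator materializes nothing up front, it avoids the per-pass list rebuild that a recursive buildList-style traversal pays.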
Key optimizations of large plan mode:
1. Reusable graph: avoid re-initialization.
2. Efficient traversal: skip stable subtrees.
3. Fine-grained GC.

Usage: see the comments of HepPlanner().

Perf result of LargePlanBenchmark:

Match Order   Union Num   Node Count   Rule Transforms   Time (ms)
-----------------------------------------------------------------
ARBITRARY          1000         4000              6006        1043
ARBITRARY          3000        12000             18006        1306
ARBITRARY         10000        40000             60006        3655
ARBITRARY         30000       120000            180006       13040
DEPTH_FIRST        1000         4000              6006         347
DEPTH_FIRST        3000        12000             18006        1068
DEPTH_FIRST       10000        40000             60006        4165
DEPTH_FIRST       30000       120000            180006       12898
BOTTOM_UP          1000         4000              6006        1145
BOTTOM_UP          3000        12000             18006       10152
TOP_DOWN           1000         4000              6006        1193
TOP_DOWN           3000        12000             18006        8074
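The "skip stable subtree" optimization can be sketched in a few lines: once a pass over a subtree fires no rules, the subtree is marked stable, and later passes skip it until something inside it changes. This is a minimal illustration of the idea, not the actual HepPlanner code; all names are hypothetical, and real parent-pointer invalidation is omitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of skipping stable subtrees across repeated rule passes. */
public class StableSkipTraversal {
  static class Node {
    final List<Node> children = new ArrayList<>();
    boolean stable = false;  // true when no rule has fired here or below since the last pass
  }

  /** Visits only subtrees that may still change; returns the number of nodes visited. */
  static int visit(Node node) {
    if (node.stable) {
      return 0;  // whole subtree unchanged since last pass: skip it entirely
    }
    int count = 1;
    for (Node c : node.children) {
      count += visit(c);
    }
    node.stable = true;  // treat as stable until a rule fires in it again
    return count;
  }

  /** Called when a rule fires at this node (ancestor invalidation omitted for brevity). */
  static void markChanged(Node node) {
    node.stable = false;
  }
}
```

The second pass over an unchanged plan then costs O(1) per stable root instead of O(n) over the whole graph, which is where the DEPTH_FIRST numbers in the table above get their near-linear scaling.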
private boolean enableFiredRulesCache = false;

private boolean largePlanMode = false;
Is the optimization somehow worse for small plans?
If it's always faster, there should not be such a flag, so you should document it as "To be removed in the future".
If small plans are worse in this mode, the flag should remain, but it still should be documented somehow (e.g., how to choose whether to set it).
The optimization is always better, for plans at any scale.
I will add the comment "To be removed in the future".
I'm not very familiar with this optimization; I only took a quick look, but it seems to be quite effective. I do have one concern, though: should we add a specific test to verify the correctness of the execution plan? Alternatively, could we simply remove the largePlanMode configuration entirely (or set it to true for the purpose of comparative testing) and let all existing test cases validate the plan's correctness?
- LargePlanBenchmark includes a test that compares the rule match counts of different match orders; it is a specific test to verify correctness.
- I think this also depends on how the community views the specific compatibility impact of this change. The optimization still guarantees that planning does not stop until no more rules can be matched.
However, if a user has some hacky rules that rely on the traversal order strictly following a fixed pattern, the optimization result may change for those users. I believe such users are actually using HepPlanner incorrectly. If everyone thinks the risk of enabling this by default is acceptable, I would prefer to turn this optimization on by default.
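The guarantee "planning does not stop until no more rules can be matched" can be illustrated with a toy rewrite system: for a confluent rule set, the fixpoint is the same whichever match is tried first, while rules that are sensitive to match order are exactly the ones that could observe a different result. This is a hedged sketch with hypothetical names, not Calcite code; the single rule here collapses "aa" to "a".

```java
import java.util.function.UnaryOperator;

/** Toy demonstration that a confluent rule set reaches the same
 * fixpoint regardless of where matches are tried first. */
public class FixpointDemo {
  // Rule "aa" -> "a", applied at the leftmost match.
  static String rewriteLeftmost(String s) {
    int i = s.indexOf("aa");
    return i < 0 ? s : s.substring(0, i) + s.substring(i + 1);
  }

  // The same rule, applied at the rightmost match instead.
  static String rewriteRightmost(String s) {
    int i = s.lastIndexOf("aa");
    return i < 0 ? s : s.substring(0, i) + s.substring(i + 1);
  }

  // Apply one rewrite step until nothing changes any more.
  static String toFixpoint(UnaryOperator<String> step, String s) {
    String prev;
    do {
      prev = s;
      s = step.apply(s);
    } while (!s.equals(prev));
    return s;
  }
}
```

Both strategies normalize "baaab" to "bab"; a rule set without this confluence property is the kind that could break if the traversal order changes, which matches the compatibility concern raised above.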
And if the type digest flag is false by default, the main performance bottleneck is type digest hashing and comparison.
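When digest hashing and comparison dominate, the usual remedy is to compute the digest once per node and cache it, so repeated comparisons of the same subtree avoid re-hashing it. The sketch below is a minimal illustration of that idea with hypothetical names; it is not the actual Calcite type-digest code, and the digest format shown is deliberately simplistic.

```java
/** Illustrative sketch of lazily caching a recursively computed type digest. */
public class DigestCache {
  static int computations = 0;  // counts how many digests were actually computed

  static class TypeNode {
    final String name;
    final TypeNode[] fields;
    private String digest;  // computed on first use, then reused

    TypeNode(String name, TypeNode... fields) {
      this.name = name;
      this.fields = fields;
    }

    String digest() {
      if (digest == null) {
        computations++;
        // Simplistic structural digest: name followed by child digests.
        StringBuilder sb = new StringBuilder(name).append('(');
        for (TypeNode f : fields) {
          sb.append(f.digest()).append(',');
        }
        digest = sb.append(')').toString();
      }
      return digest;
    }
  }
}
```

With sharing, a type referenced from many places is hashed once instead of once per comparison, which is why enabling the digest cache moves fireRule off the profile's hot path.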
Thank you for the explanation. Regarding the first question, my concern is whether a single case can effectively cover a wide range of scenarios. As for the second question, while it is true that HepPlanner supports several traversal strategies, I believe that the current optimization should not interfere with these traversal methods—since they are currently supported, we must ensure backward compatibility. If a specific traversal strategy were to be adversely affected, shouldn't we consider disabling this optimization in such instances? I would appreciate it if you could let me know whether my understanding is correct.
1. Maybe I can run all tests with large plan mode/rule cache/type digest enabled, and see the results.
2. Not adversely affected. If someone uses the match limit and match order to do match-order-sensitive things, it might break.
For the first point, I'm concerned that a one-time review of the results doesn't establish a standardized process for future iterations. For the second point, I want to ask whether this optimization can avoid these issues, or if it can automatically fall back to a degraded mode with certain constraints when users employ such methods. Is this feasible to implement?
I agree with @xiedeyantu for 1., it's OK for now but unfortunately it won't protect from future regressions.
For 2., I think I got what you mean, but if you could provide a concrete example, it would help make sure we are aligned; backwards compatibility is a major concern for libraries like Calcite.
Thank you for the review. @xiedeyantu @asolimando
I added regression coverage for HepPlanner-related cases.
I also added a concrete compatibility-sensitive case in HepPlannerTest.
If you make changes, please use new commits for now.
- Add Javadoc for largePlanMode field with "To be removed in the future" note
- Rename clear() to clearRules() for clarity in multiphase optimization
- Update usage example to use clearRules() and fix "graph is reused" to "graph is preserved"
- Remove UnsupportedOperationException in findBestExp() for large plan mode
- Fix comment about garbage collection in getGraphIterator()
- Add comment about caching fired rule before constructing HepRuleCall
- Fix "different with" to "different from" in assertion error messages
mihaibudiu
left a comment
I have approved, but please consider adding the two extra comments I have suggested.
Motivation and details: https://issues.apache.org/jira/browse/CALCITE-7422
Before:
LargePlanBenchmark:100 : 1s
LargePlanBenchmark:1000 : 9s
LargePlanBenchmark:10000 : pretty slow, cannot measure.
CPU Profiler Result (enable fired rules cache and disable large plan mode):


fireRule(RelOptRuleCall ruleCall) takes 2% of CPU time.

Mem Profiler Result (enable fired rules cache and disable large plan mode):
After (enable fired rules cache and large plan mode):
LargePlanBenchmark:100 : 1s
LargePlanBenchmark:1000 : 2s
LargePlanBenchmark:10000 : about 60s
Benchmark                               (unionNum)  Mode  Cnt      Score  Error  Units
LargePlanBenchmark.testLargeUnionPlan          100  avgt         256.561         ms/op
LargePlanBenchmark.testLargeUnionPlan         1000  avgt        1616.421         ms/op
LargePlanBenchmark.testLargeUnionPlan        10000  avgt       53393.727         ms/op

CPU Profiler Result:
fireRule(RelOptRuleCall ruleCall) takes 11% of CPU time. There is still lots of room for CPU optimization.

Mem Profiler Result:

(avoid buildListRecurse/collectGarbage, smaller memory peak size)