test: support fallback chain in CometPlanStabilitySuite, dedupe existing goldens #4129
andygrove wants to merge 2 commits into apache:main
Conversation
Adds a fallback mechanism so that when a Spark-version-specific plan-stability
directory does not contain a golden directory for a given query, the suite
walks the chain of older-version directories and ultimately the base directory
(approved-plans-vX_Y). This lets each version's directory store only the
queries whose plans actually changed against the previous version.
CometPlanStabilitySuite gains a fallbackGoldenFilePaths: Seq[String] hook.
getDirForTest is unchanged in regenerate mode (always writes to the primary
directory) and in the case where the primary already has a directory for the
query; otherwise it walks the fallback list and returns the first hit. Both
V1_4 and V2_7 subclasses build the chain via planNameChain(variant), which
assembles the correct sequence based on isSpark{35,40,41}Plus.
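The chain assembly could be sketched roughly as follows. This is a minimal sketch, not Comet's actual code: the boolean parameters stand in for the `isSpark{35,40,41}Plus` checks, and the real signature may differ.

```scala
// Hypothetical sketch of planNameChain: newest version directory first,
// ending at the unsuffixed base directory (approved-plans-<variant>).
def planNameChain(variant: String,
                  isSpark41Plus: Boolean,
                  isSpark40Plus: Boolean,
                  isSpark35Plus: Boolean): Seq[String] = {
  val versioned = Seq(
    isSpark41Plus -> s"approved-plans-$variant-spark4_1",
    isSpark40Plus -> s"approved-plans-$variant-spark4_0",
    isSpark35Plus -> s"approved-plans-$variant-spark3_5"
  ).collect { case (true, dir) => dir }
  // The fallback walk always bottoms out at the base directory.
  versioned :+ s"approved-plans-$variant"
}
```

The head of the resulting sequence would serve as the primary directory and the tail as the fallback list.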
When SPARK_GENERATE_GOLDEN_FILES=1, afterAll() walks the just-written primary
directory and drops any query directory whose contents match what the fallback
chain resolves to, mirroring the read-time logic. This keeps version-specific
directories sparse without any extra bash plumbing; dev/regenerate-golden-files.sh
just invokes the suites and now also accepts --spark-version 4.1 (now supported
on main).
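The afterAll() prune described above could look roughly like this. A hedged sketch, not Comet's implementation: the helper names `sameContents` and `pruneDuplicates` are assumptions, as is the flat one-level layout of each query directory.

```scala
import java.io.File
import java.nio.file.Files

// True when two flat golden directories hold the same file names
// with byte-identical contents.
def sameContents(a: File, b: File): Boolean = {
  val as = Option(a.listFiles()).getOrElse(Array.empty[File]).sortBy(_.getName)
  val bs = Option(b.listFiles()).getOrElse(Array.empty[File]).sortBy(_.getName)
  as.length == bs.length && as.zip(bs).forall { case (f, g) =>
    f.getName == g.getName &&
      java.util.Arrays.equals(Files.readAllBytes(f.toPath), Files.readAllBytes(g.toPath))
  }
}

// Drop every query directory in `primary` whose contents match what the
// fallback chain resolves to, mirroring the read-time lookup.
def pruneDuplicates(primary: File, fallbacks: Seq[File]): Unit =
  for (queryDir <- Option(primary.listFiles()).getOrElse(Array.empty[File])
       if queryDir.isDirectory) {
    val resolved = fallbacks.iterator
      .map(f => new File(f, queryDir.getName))
      .find(_.isDirectory)
    if (resolved.exists(sameContents(queryDir, _))) {
      queryDir.listFiles().foreach(_.delete())
      queryDir.delete()
    }
  }
```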
…chain
Drops 766 query directories from approved-plans-{v1_4,v2_7}-spark{3_5,4_0,4_1}
whose contents are identical to what the fallback chain resolves to (the
older-version directory or, ultimately, the base approved-plans-{v1_4,v2_7}).
Each version-specific directory now retains only the queries whose plans
actually diverge from the previous tier:
approved-plans-v1_4-spark3_5: 196 -> 10 dirs (10 diverge from base)
approved-plans-v1_4-spark4_0: 194 -> 12 dirs
approved-plans-v1_4-spark4_1: 196 -> 10 dirs
approved-plans-v2_7-spark3_5: 60 -> 4 dirs
approved-plans-v2_7-spark4_0: 60 -> 4 dirs
approved-plans-v2_7-spark4_1: 60 -> 4 dirs
Verified locally that CometTPCDSV1_4_PlanStabilitySuite (194 tests) and
CometTPCDSV2_7_PlanStabilitySuite (64 tests) pass with 0 failures against
-Pspark-3.5, -Pspark-4.0, and -Pspark-4.1. Spark 3.4 reads only the base
directory and is unaffected.
comphead
left a comment
Thanks @andygrove, this is a nice cleanup. Just thinking aloud about the dedup: can there be a situation where the same query has different plans between Spark versions?
Yes, some queries have different plans between Spark versions. There are some optimizer changes in 4.1 that affected a few queries and produced different plans than 4.0, but most queries were not affected.
I see, so if such a discrepancy happens, we would need to attach the query plans for all specific Spark versions like before?
kazuyukitanimura
left a comment
looks like there is a conflict
```scala
fallbackGoldenFilePaths.iterator
  .map(p => new File(p, goldenFileName))
  .find(_.isDirectory)
  .getOrElse(primary)
```
is it correct to return primary as default?
Yes, we will store all unique plans per query.
Which issue does this PR close?
Closes #.
Rationale for this change
Each Spark-version-specific TPC-DS plan-stability directory currently stores a full copy of every query's golden plan, even when that query's plan is byte-identical to the previous version's. After the recent Spark 4.1 enablement (#4106), the per-version `approved-plans-*-spark4_1` directories duplicated 196 of 206 (v1_4) and 60 of 64 (v2_7) plans against the 4.0 directory, which itself was largely a duplicate of 3.5, which was largely a duplicate of the base. This pattern grows quadratically as new versions are added. This PR teaches `CometPlanStabilitySuite` to fall back through the chain of older version directories at read time, so each version-specific directory only needs to contain the queries whose plans actually diverge.

What changes are included in this PR?
`CometPlanStabilitySuite` gains a `protected def fallbackGoldenFilePaths: Seq[String]` hook. `getDirForTest` is unchanged in regenerate mode (always writes to the primary directory). For read-time lookups, when the primary `goldenFilePath` does not contain a directory for the active query, the suite walks the fallback list and returns the first existing directory.

Both `CometTPCDSV1_4_PlanStabilitySuite` and `CometTPCDSV2_7_PlanStabilitySuite` build their fallback chain via `CometPlanStabilitySuite.planNameChain(variant)`, which assembles the right sequence based on `isSpark{35,40,41}Plus`. The chain ends at the unsuffixed base directory.
When `SPARK_GENERATE_GOLDEN_FILES=1`, the suite's `afterAll()` walks the just-written primary directory and drops any query directory whose contents match what the fallback chain resolves to. This mirrors the read-time logic and keeps version-specific directories sparse without any extra bash plumbing. `dev/regenerate-golden-files.sh` is now just a thin wrapper that invokes the suites for each Spark version and accepts `--spark-version 4.1` (now supported on `main`).

Applies the prune once across the existing tree, removing 766 duplicate query directories from:

- `approved-plans-v1_4-spark3_5`
- `approved-plans-v1_4-spark4_0`
- `approved-plans-v1_4-spark4_1`
- `approved-plans-v2_7-spark3_5`
- `approved-plans-v2_7-spark4_0`
- `approved-plans-v2_7-spark4_1`

How are these changes tested?
Locally, `CometTPCDSV1_4_PlanStabilitySuite` (194 tests) and `CometTPCDSV2_7_PlanStabilitySuite` (64 tests) pass with 0 failures against `-Pspark-3.5`, `-Pspark-4.0`, and `-Pspark-4.1` after the prune. Spark 3.4 reads only the base directory and is unaffected. CI will exercise all four matrix profiles.

Verified the auto-prune end-to-end: a full regen of `-Pspark-4.1` writes all 206 query directories to `approved-plans-v1_4-spark4_1/`, then `afterAll()` drops the 196 that match what the fallback chain (4.0 → 3.5 → base) would resolve to, leaving the 10 divergent directories. Repeat regens are idempotent.
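Putting the read-time pieces together, the lookup could be sketched like this. A minimal sketch under the descriptions above; the parameter names are illustrative, not Comet's actual `getDirForTest` signature.

```scala
import java.io.File

// In regenerate mode, always target the primary directory; otherwise use the
// primary if it already holds this query, else the first fallback hit
// (defaulting back to the primary so a genuinely missing plan still fails
// against the expected location).
def getDirForTest(primaryPath: String,
                  fallbackGoldenFilePaths: Seq[String],
                  goldenFileName: String,
                  regenerateGoldenFiles: Boolean): File = {
  val primary = new File(primaryPath, goldenFileName)
  if (regenerateGoldenFiles || primary.isDirectory) primary
  else
    fallbackGoldenFilePaths.iterator
      .map(p => new File(p, goldenFileName))
      .find(_.isDirectory)
      .getOrElse(primary)
}
```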