Overlay: Use overlay-aware CLI version when analyzing PRs#3880
Overlay: Use overlay-aware CLI version when analyzing PRs#3880henrymercer wants to merge 8 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates how the Action selects the default CodeQL CLI version so that, when analyzing pull requests, it can prefer an enabled CLI version that already has cached overlay-base databases for the configured languages (to speed up overlay/incremental analysis), while still respecting feature-flag rollback constraints.
Changes:
- Extend the default CLI “version info” returned from feature flags to include a sorted list of enabled default versions (not just a single version).
- Add PR-aware logic in
setup-codeqlto optionally pick the highest enabled version that intersects with overlay-base DB cache entries (with dry-run telemetry support). - Thread the new version-info shape through callers and update unit tests and the changelog entry.
Show a summary per file
| File | Description |
|---|---|
| src/upload-lib.ts | Switches to the new getEnabledDefaultCliVersions API and passes rawLanguages through initCodeQL. |
| src/testing-utils.ts | Updates test fixtures/mocks for the new CodeQLDefaultVersionInfo.enabledVersions shape. |
| src/start-proxy.ts | Adapts to the new version-info shape by selecting enabledVersions[0] for proxy downloads. |
| src/start-proxy.test.ts | Updates stubbing to getEnabledDefaultCliVersions. |
| src/setup-codeql.ts | Implements overlay-aware default-version resolution for PR analyses (feature-flag gated) and threads rawLanguages. |
| src/setup-codeql.test.ts | Updates call sites for new rawLanguages parameter and adds unit tests for overlay-cache version filtering. |
| src/setup-codeql-action.ts | Updates to getEnabledDefaultCliVersions and passes rawLanguages (currently undefined). |
| src/init.ts | Threads rawLanguages through to setupCodeQL. |
| src/init-action.ts | Uses getEnabledDefaultCliVersions and passes rawLanguages derived from the languages input. |
| src/feature-flags.ts | Introduces CodeQLVersionInfo, changes default version info to enabledVersions[], and adds new overlay match feature flags. |
| src/feature-flags.test.ts | Updates tests to validate multi-version enablement ordering/fallback behavior. |
| src/codeql.ts | Threads rawLanguages into tool setup to support PR-aware version resolution. |
| src/codeql.test.ts | Updates tests for the new default-version info shape and new function signatures. |
| CHANGELOG.md | Adds an UNRELEASED entry describing the experimental overlay-aware default version selection. |
| lib/upload-sarif-action-post.js | Generated JS output (not reviewed). |
| lib/upload-lib.js | Generated JS output (not reviewed). |
| lib/start-proxy-action.js | Generated JS output (not reviewed). |
| lib/start-proxy-action-post.js | Generated JS output (not reviewed). |
| lib/resolve-environment-action.js | Generated JS output (not reviewed). |
| lib/init-action.js | Generated JS output (not reviewed). |
| lib/autobuild-action.js | Generated JS output (not reviewed). |
| lib/analyze-action.js | Generated JS output (not reviewed). |
| lib/analyze-action-post.js | Generated JS output (not reviewed). |
Copilot's findings
Comments suppressed due to low confidence (1)
src/start-proxy.test.ts:1029
- The stub variable is named
getDefaultCliVersion, but it actually stubsgetEnabledDefaultCliVersions. Renaming the local variable (and the later assertion) would avoid confusion and better reflect what’s being tested.
const getDefaultCliVersion = sinon
.stub(features, "getEnabledDefaultCliVersions")
.resolves({
enabledVersions: [{ cliVersion: "2.20.1", tagName: expectedTag }],
});
const path = await startProxyExports.getProxyBinaryPath(logger, features);
t.assert(getDefaultCliVersion.calledOnce);
sinon.assert.calledOnceWithMatch(
- Files reviewed: 14/26 changed files
- Comments generated: 3
| const version = await resolveDefaultCliVersion( | ||
| defaultCliVersion, | ||
| rawLanguages, | ||
| features, | ||
| logger, | ||
| ); | ||
| cliVersion = version.cliVersion; | ||
| tagName = version.tagName; | ||
| } |
mbg
left a comment
There was a problem hiding this comment.
Thanks for breaking this up into (reasonably) sane commits and a separate PR from the previous one! Overall, this looks like a pretty good first stab at this change and I don't see any major issues here.
I have left a bunch of detailed comments. There are a couple in particular about possible issues down the road that we might want to consider documenting or guarding against.
I am wondering if the second FF is a bit overkill and adds an extra layer of complexity on top of an already fairly complex change. I don't feel strongly about this, but the thought crossed my mind whether a reasonable thing to do here to reduce complexity / risk would be to consider breaking this PR up into three separate changes:
- The first commit that modifies some of the existing logic to return all FF-enabled CLI versions, but doesn't make use of this information. We can ship that change without a new FF.
- Then ship the new logic that's guarded by the second FF. I.e. everything except actually using the CLI version that we determined using the new process.
- Then ship the first FF + the logic that uses the CLI version determined using the new process.
| ## [UNRELEASED] | ||
|
|
||
| No user facing changes. | ||
| - Added an experimental change which, when running a Code Scanning analysis for a PR with [improved incremental analysis](https://github.com/github/roadmap/issues/1158) enabled, prefers CodeQL CLI versions that have a cached overlay-base database for the configured languages. This speeds up analysis when the latest version does not yet have a cached overlay-base database. We expect to roll this change out to everyone in May. [#3880](https://github.com/github/codeql-action/pull/3880) |
There was a problem hiding this comment.
| - Added an experimental change which, when running a Code Scanning analysis for a PR with [improved incremental analysis](https://github.com/github/roadmap/issues/1158) enabled, prefers CodeQL CLI versions that have a cached overlay-base database for the configured languages. This speeds up analysis when the latest version does not yet have a cached overlay-base database. We expect to roll this change out to everyone in May. [#3880](https://github.com/github/codeql-action/pull/3880) | |
| - Added an experimental change which, when running a Code Scanning analysis for a PR with [improved incremental analysis](https://github.com/github/roadmap/issues/1158) enabled, prefers CodeQL CLI versions that have a cached overlay-base database for the configured languages. This speeds up analysis for a repository when there is not yet a cached overlay-base database for the latest CLI version. We expect to roll this change out to everyone in May. [#3880](https://github.com/github/codeql-action/pull/3880) |
| enabledVersions: [ | ||
| { | ||
| cliVersion: defaults.cliVersion, | ||
| tagName: defaults.bundleVersion, | ||
| }, | ||
| ], |
There was a problem hiding this comment.
Minor: Could we define this as a constant to avoid some duplication?
| // version, or the one enabled by FFs. | ||
| const versionInfo = useFeaturesToDetermineCLI | ||
| ? await getCliVersionFromFeatures(features) | ||
| ? (await getCliVersionFromFeatures(features)).enabledVersions[0] |
There was a problem hiding this comment.
This is an interesting change. It might mean that the bundle release used by start-proxy and init could diverge. That's certainly not a problem right now, since we don't depend on them lining up. (I.e. we have thus far not had any reason to require the proxy to be of at least a particular version for changes made in the CLI/extractors.)
It could also mean that a base database is extracted using the version of the proxy that shipped with the corresponding CLI, and we then end up using a newer proxy for the overlay analysis on a PR. I don't see an immediate problem with it, but it is worth noting.
We might want to think about whether this is something we want to document somehow so that it doesn't become a problem if we ever want to make that assumption.
| getTemporaryDirectory(), | ||
| gitHubVersion.type, | ||
| codeQLDefaultVersionInfo, | ||
| undefined, // rawLanguages: currently, setup-codeql is not language aware |
There was a problem hiding this comment.
A potential issue down the line if we ever want to expand on setup-codeql and allow it to run before init. It might be worth documenting the implications of this on overlay analysis here.
| const codeQLDefaultVersionInfo = | ||
| await features.getEnabledDefaultCliVersions(gitHubVersion.type); | ||
| toolsFeatureFlagsValid = codeQLDefaultVersionInfo.toolsFeatureFlagsValid; | ||
| const rawLanguages = configUtils.getRawLanguagesNoAutodetect( |
There was a problem hiding this comment.
Could we add a comment here to say what this is for? E.g. "Split the comma-separated input string (if any) into an array of its components to use for identifying overlay base databases".
| sinon.stub(api, "getAutomationID").resolves("test/"); | ||
| sinon.stub(api, "listActionsCaches").resolves([ | ||
| { | ||
| key: "codeql-overlay-base-database-1-aaaaaaaaaaaaaaaa-javascript-2.20.2-abc-1-1", |
There was a problem hiding this comment.
Minor: Could we recycle one of the functions for computing the cache key here so that this doesn't break immediately if we make any changes to the cache key format?
| { | ||
| key: "codeql-overlay-base-database-1-aaaaaaaaaaaaaaaa-javascript-2.20.1-def-2-1", | ||
| }, | ||
| { | ||
| key: "codeql-overlay-base-database-1-aaaaaaaaaaaaaaaa-javascript-2.20.0-ghi-3-1", | ||
| }, |
There was a problem hiding this comment.
Swap the order of these to exercise the "sorted desc" part of the test?
|
|
||
| const result = await setupCodeql.getEnabledVersionsWithOverlayBaseDatabases( | ||
| overlayMatchEnabledVersions, | ||
| undefined, |
There was a problem hiding this comment.
Test name says "empty", but here we pass in undefined directly. What if we get []?
| ); | ||
|
|
||
| test.serial( | ||
| "getEnabledVersionsWithOverlayBaseDatabases includes the highest version when it is cached", |
There was a problem hiding this comment.
Could we make this test name a bit clearer? The test body seems to just set up a single cache and the test then checks that it is included in the results. I don't see how this is specific to "the highest"?
| const useOverlayAwareDefaultCliVersion = !!analysisKinds?.includes( | ||
| AnalysisKind.CodeScanning, | ||
| ); |
There was a problem hiding this comment.
Just to be clear, is the intention to:
- Use this for only
code-scanninganalyses? - Or for any analysis involving
code-scanning?
When analyzing PRs, prefer CLI versions that have cached overlay-base databases to speed up analysis time. However to ensure we can effectively rollback new versions, do not use a CodeQL version whose feature flag is disabled, even if this means running without overlay analysis.
This will be shipped via two feature flags:
overlay_analysis_match_codeql_version_dry_runlogs a diagnostic when the overlay-aware version differs from the latest enabled versionoverlay_analysis_match_codeql_versionuses the overlay-aware version when analysing PRsRisk assessment
For internal use only. Please select the risk level of this change:
Which use cases does this change impact?
Workflow types:
dynamicworkflows (Default Setup, Code Quality, ...).Products:
analysis-kinds: code-scanning.analysis-kinds: code-quality.Environments:
github.comand/or GitHub Enterprise Cloud with Data Residency.How did/will you validate this change?
.test.tsfiles).If something goes wrong after this change is released, what are the mitigation and rollback strategies?
How will you know if something goes wrong after this change is released?
Are there any special considerations for merging or releasing this change?
Merge / deployment checklist