Extended integration test playground with real CodeQL databases#171
data-douser wants to merge 1 commit into main from
Conversation
Add config-driven extended test runner that validates MCP server tools against real CodeQL databases from GitHub.

Infrastructure:
- run-extended-tests.ts: 7-scenario test orchestrator with server log capture
- download-databases.ts: DB discovery from vscode-codeql storage + GitHub API download
- repos.json: 4 test repos (gin-gonic/gin, expressjs/express, checkstyle/checkstyle, PyCQA/flake8)
- Three-tier auth: VS Code session -> GH_TOKEN -> gh auth token

Test scenarios:
- Database resolution and listing
- Query execution (BQRS decode, SARIF interpretation)
- CallGraphFromTo with external predicates
- BQRS info and interpretation tools

Build infrastructure:
- esbuild config updated for extended test bundling
- ESLint config updated for test files
- .gitignore updated for downloaded databases

Closes #167
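The three-tier auth fallback described above can be sketched roughly as follows. The first tier (VS Code authentication session) needs the extension host and is elided here; `resolveGitHubToken` is an illustrative name, not necessarily the helper the PR uses:

```typescript
import { execFileSync } from 'child_process';

// Sketch of tiers 2 and 3 of the auth fallback: prefer the GH_TOKEN
// environment variable, then fall back to the gh CLI's stored token.
function resolveGitHubToken(): string | undefined {
  // Tier 2: environment variable
  if (process.env.GH_TOKEN) return process.env.GH_TOKEN;
  // Tier 3: gh CLI (returns undefined if gh is missing or unauthenticated)
  try {
    const token = execFileSync('gh', ['auth', 'token'], { encoding: 'utf8' }).trim();
    return token || undefined;
  } catch {
    return undefined;
  }
}
```

Returning `undefined` rather than throwing lets the downloader proceed unauthenticated (or skip downloads) when no credential is available.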
Dependency Review
✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.

Snapshot Warnings
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

Scanned Files
None
```typescript
console.log(`  ⬇ Downloading: ${owner}/${repoName} (${language})...`);

try {
  const response = await fetch(url, {
```
Check warning (Code scanning / CodeQL): File data in outbound network request (Medium, test)
```typescript
const nested = join(dbDir, entries[0]);
if (existsSync(join(nested, 'codeql-database.yml'))) {
  // Copy all contents up, then remove the nested directory
  execFileSync('bash', ['-c', `cp -a "${nested}"/. "${dbDir}/" && rm -rf "${nested}"`]);
```
Check warning (Code scanning / CodeQL): Shell command built from environment values (Medium, test)
Copilot Autofix (AI, about 22 hours ago)
In general, to fix this kind of problem you should avoid passing dynamically constructed command strings to a shell (bash -c or sh -c). Instead, either (1) call the underlying program directly with arguments passed as an array, so that they are not interpreted by the shell, or (2) implement the needed filesystem operations directly in your language using standard libraries. Both approaches prevent shell metacharacters in paths from changing the command’s meaning.
For this specific case, the shell is only being used to run cp -a "${nested}"/. "${dbDir}/" && rm -rf "${nested}", which copies the contents of one directory to another and then deletes the source. We can replicate this logic in pure Node.js using fs operations, avoiding bash entirely. Since the surrounding code already uses synchronous filesystem operations (existsSync, mkdirSync, readdirSync, statSync), it is consistent to use synchronous copy and delete functions as well.
The best minimal-impact fix is:

- Remove the `bash` usage and implement a small helper that:
  - Ensures the destination directory exists (`mkdirSync` with `{ recursive: true }`).
  - Recursively copies all files and subdirectories from `nested` into `dbDir`. The Node.js `fs` module (from Node 16.7+ / 18+ depending on tooling) provides `cpSync` for this, which preserves directory structure and can copy recursively. If available in the project's runtime, this is the simplest and most robust choice.
  - Removes the `nested` directory once the copy succeeds, using `rmSync(nested, { recursive: true, force: true })` (Node 14.14+).
- To avoid broad changes, add `cpSync` and `rmSync` to the existing `fs` import in `extensions/vscode/test/extended/download-databases.ts` and replace only the `execFileSync('bash', ...)` call with these `fs` operations.

Concretely:

- In `extensions/vscode/test/extended/download-databases.ts`, update the import from `fs` to include `cpSync` and `rmSync`.
- Replace lines 131–133 (the comment and the `execFileSync('bash', ...)` line) with a comment plus equivalent `cpSync` and `rmSync` calls. Keep the behavior (copy contents of `nested` into `dbDir` and then delete `nested`), and preserve the directory structure.

No changes are required in `run-extended-tests.ts`; once the shell usage is removed, all three CodeQL variants will be addressed because the tainted path no longer reaches a shell sink.
```diff
@@ -8,7 +8,7 @@
  * Downloads are cached on disk and reused if less than 24 hours old.
  */

-import { createWriteStream, existsSync, mkdirSync, readFileSync, readdirSync, statSync } from 'fs';
+import { createWriteStream, existsSync, mkdirSync, readFileSync, readdirSync, statSync, cpSync, rmSync } from 'fs';
 import { join } from 'path';
 import { execFileSync } from 'child_process';
 import { homedir } from 'os';
@@ -129,7 +129,8 @@
   const nested = join(dbDir, entries[0]);
   if (existsSync(join(nested, 'codeql-database.yml'))) {
     // Copy all contents up, then remove the nested directory
-    execFileSync('bash', ['-c', `cp -a "${nested}"/. "${dbDir}/" && rm -rf "${nested}"`]);
+    cpSync(nested, dbDir, { recursive: true });
+    rmSync(nested, { recursive: true, force: true });
   }
 }
```
Pull request overview
Adds an opt-in “extended integration test playground” under the VS Code extension that can download/resolve real CodeQL databases and run multi-scenario validation against a locally spawned MCP server.
Changes:
- Introduces a standalone extended test runner (`run-extended-tests.ts`) plus a database discovery/download helper (`download-databases.ts`) and config (`repos.json`).
- Updates VS Code extension build + lint config to bundle and allow the extended runner to execute via Node.
- Adds a new integration test script entry and ignores generated CodeQL CLI diagnostic files.
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| extensions/vscode/test/extended/run-extended-tests.ts | New orchestrator that resolves databases, spawns the server over stdio, and runs 7 scenarios. |
| extensions/vscode/test/extended/download-databases.ts | New helper to discover existing DBs and download missing ones via GitHub REST API with token fallback. |
| extensions/vscode/test/extended/repos.json | Config for repos/languages and scenario parameters. |
| extensions/vscode/esbuild.config.js | Adds a separate esbuild target for bundling the extended runner. |
| extensions/vscode/eslint.config.mjs | Adds fetch to globals for the extended test code. |
| extensions/vscode/package.json | Adds test:integration:extended script to run the bundled extended runner. |
| .gitignore | Ignores CodeQL CLI diagnostic JSON files. |
```typescript
const candidates = [
  join(home, 'Library', 'Application Support', 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
  join(home, 'Library', 'Application Support', 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
  join(home, '.config', 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
  join(home, '.config', 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
];
```
This list of vscode-codeql globalStorage locations only covers macOS/Linux. On Windows, the globalStorage directory is typically under %APPDATA%/Code/User/globalStorage/...; without adding Windows candidates, discovery will miss existing databases unless users manually configure search dirs.
Suggested change:

```typescript
const candidates = [
  // macOS
  join(home, 'Library', 'Application Support', 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
  join(home, 'Library', 'Application Support', 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
  // Linux
  join(home, '.config', 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
  join(home, '.config', 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
];
// Windows: %APPDATA%\Code\User\globalStorage\GitHub.vscode-codeql
if (process.platform === 'win32' && process.env.APPDATA) {
  candidates.push(
    join(process.env.APPDATA, 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
    join(process.env.APPDATA, 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
  );
}
```
```typescript
// Build colon-delimited database dirs for the MCP server
const databaseDirs = [...new Set(Array.from(databases.values()).map(p => resolve(p, '..')))].join(':');
```
CODEQL_DATABASES_BASE_DIRS is parsed using the platform path-list delimiter (path.delimiter: ':' on POSIX, ';' on Windows). Building databaseDirs with ':' will break database discovery on Windows; use path.delimiter when joining database base dirs (and avoid hardcoding ':').
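A minimal sketch of that fix, assuming only that the runner builds `databaseDirs` from a Map of database paths (the map contents below are illustrative, not from the PR):

```typescript
import { delimiter, resolve } from 'path';

// Stand-in for the runner's database map (name -> database path)
const databases = new Map<string, string>([
  ['gin', '/tmp/dbs/gin-gonic/gin-go'],
  ['express', '/tmp/dbs/expressjs/express-js'],
]);

// path.delimiter is ':' on POSIX and ';' on Windows, so the joined list
// matches whatever the server splits on for the current platform.
const databaseDirs = [
  ...new Set(Array.from(databases.values()).map(p => resolve(p, '..'))),
].join(delimiter);

console.log(databaseDirs);
```

Using `path.delimiter` on both the producing and consuming side keeps the env-var contract consistent without any platform branching.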
```typescript
// Stream to zip file
const dest = createWriteStream(zipPath);
// @ts-expect-error — ReadableStream → NodeJS.ReadableStream interop
await pipeline(response.body, dest);
```
response.body from fetch() is a Web ReadableStream in Node 18+; stream/promises.pipeline expects a Node stream/AsyncIterable. This is likely to throw at runtime. Convert via Readable.fromWeb(response.body) (or buffer the response) before piping to the file stream.
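A sketch of the suggested conversion, assuming Node 18+ (where `fetch` and `Readable.fromWeb` are both available); `downloadTo` is a hypothetical helper name, the real code lives inline in download-databases.ts:

```typescript
import { createWriteStream } from 'fs';
import { Readable } from 'stream';
import { pipeline } from 'stream/promises';

async function downloadTo(url: string, zipPath: string): Promise<void> {
  const response = await fetch(url);
  if (!response.ok || !response.body) {
    throw new Error(`Download failed: HTTP ${response.status}`);
  }
  // Readable.fromWeb bridges the Web ReadableStream from fetch() to a
  // Node Readable that stream/promises.pipeline accepts.
  await pipeline(
    Readable.fromWeb(response.body as import('stream/web').ReadableStream),
    createWriteStream(zipPath),
  );
}
```

The cast is needed because the DOM-flavored `ReadableStream` type from `fetch` and the `stream/web` type are structurally compatible but nominally distinct in current Node typings.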
```typescript
// List all cached entries
const lookupResult = await client.callTool({
  arguments: {},
  name: 'query_results_cache_lookup',
});
```
This scenario assumes query_results_cache_* tools exist. In the current server build in this repo, those tools are not present, so callTool will fail with “tool not found” and the extended runner will always report failures. To keep this runner independent/opt-in, consider calling client.listTools() up front and skipping cache scenarios when the required tools aren’t available.
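That gating could look like the following sketch, assuming the MCP SDK client exposes `listTools()` returning `{ tools: [{ name }, ...] }`; `skipUnlessTools` is an illustrative helper name:

```typescript
// Structural type so the helper works with any client exposing listTools()
type ToolLister = { listTools: () => Promise<{ tools: { name: string }[] }> };

async function availableToolNames(client: ToolLister): Promise<Set<string>> {
  const { tools } = await client.listTools();
  return new Set(tools.map(t => t.name));
}

// Gate a scenario on the tools it needs; returns true if it was skipped.
async function skipUnlessTools(
  client: ToolLister,
  required: string[],
  scenario: string,
): Promise<boolean> {
  const names = await availableToolNames(client);
  const missing = required.filter(n => !names.has(n));
  if (missing.length > 0) {
    console.warn(`SKIP ${scenario}: server does not expose ${missing.join(', ')}`);
    return true;
  }
  return false;
}
```

Listing tools once at startup (rather than per scenario) would also work; the per-call form above just keeps each scenario self-contained.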
```typescript
// Capture server stderr for the test report
// The transport exposes the child process after start()
const origStart = transport.start.bind(transport);
transport.start = async function (...args: Parameters<typeof transport.start>) {
  const result = await origStart(...args);
  // Access the spawned process stderr via the transport's internal process
  const proc = (transport as unknown as { _process?: { stderr?: { on: (e: string, cb: (d: Buffer) => void) => void } } })._process;
  if (proc?.stderr) {
    proc.stderr.on('data', (chunk: Buffer) => {
      const text = chunk.toString();
      serverLogLines.push(text);
      // Also echo to our stderr so the user can see live output
      process.stderr.write(text);
    });
  }
  return result;
};
```
This relies on monkey-patching transport.start and accessing a private _process field to capture stderr. That is brittle against SDK internals changing and may break log capture silently. Prefer a supported stderr hook from the transport (if available) or manage the child process spawning directly for predictable logging.
Suggested change:

```typescript
// Start the transport and capture server stderr for the test report
await transport.start();
const proc = (transport as unknown as {
  process?: { stderr?: { on: (e: string, cb: (d: Buffer) => void) => void } };
}).process;
if (proc?.stderr) {
  proc.stderr.on('data', (chunk: Buffer) => {
    const text = chunk.toString();
    serverLogLines.push(text);
    // Also echo to our stderr so the user can see live output
    process.stderr.write(text);
  });
}
```
```typescript
const nested = join(dbDir, entries[0]);
if (existsSync(join(nested, 'codeql-database.yml'))) {
  // Copy all contents up, then remove the nested directory
  execFileSync('bash', ['-c', `cp -a "${nested}"/. "${dbDir}/" && rm -rf "${nested}"`]);
}
```
Flattening with `bash -c 'cp ... && rm -rf ...'` is not cross-platform (Windows) and assumes GNU tools. Prefer implementing the flatten step with Node `fs` APIs so the downloader works consistently across dev environments.
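A pure-`fs` flatten can be sketched like this (`cpSync` requires Node 16.7+, `rmSync` 14.14+); `flattenNestedDatabase` is an illustrative name:

```typescript
import { cpSync, rmSync } from 'fs';

// Mirrors `cp -a nested/. dbDir/ && rm -rf nested` with no shell involved.
function flattenNestedDatabase(dbDir: string, nested: string): void {
  // Merge the nested directory's contents up into dbDir, preserving structure
  cpSync(nested, dbDir, { recursive: true });
  // Then drop the now-redundant nested directory
  rmSync(nested, { recursive: true, force: true });
}
```

Because no string ever reaches a shell, paths containing spaces or metacharacters are handled safely on every platform.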
```typescript
settings: {
  databaseDir: string;
  fixtureSearchDirs?: string[];
  timeoutMs: number;
```
config.settings.timeoutMs is defined in the config schema and repos.json but is never used in the runner. Either wire it into the MCP client/tool-call timeouts (or scenario execution timeout) or remove it from the config to avoid misleading configuration knobs.
Suggested change: remove the `timeoutMs: number;` field.
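If the knob is kept instead, one SDK-agnostic way to wire `timeoutMs` into scenario execution is racing each tool call against a timer. A hedged sketch, with `withTimeout` as an illustrative name:

```typescript
// Race a promise against a timer; rejects with a labeled error on timeout.
async function withTimeout<T>(promise: Promise<T>, timeoutMs: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer); // avoid keeping the event loop alive after the race
  }
}
```

Each `client.callTool(...)` in the runner could then be wrapped as `withTimeout(client.callTool(...), config.settings.timeoutMs, scenarioName)`.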
```typescript
    owner: repo.owner,
    repo: repo.repo,
  },
  name: 'audit_store_findings',
});
```
Audit/annotation tool calls are unconditional, but the current server build in this repo does not expose audit_* tools. To keep the extended runner opt-in and resilient across server versions, consider listing tools at startup and skipping Scenario 5 when unsupported (instead of hard-failing).
```typescript
format: 'sarif-latest',
queryLanguage: repo.language,
queryName: 'CallGraphFromTo',
sourceFunction,
targetFunction,
```
The runner uses queryName 'CallGraphFromTo', but the server’s bundled tool queries in this repo currently include CallGraphFrom/CallGraphTo (not CallGraphFromTo). If CallGraphFromTo isn’t guaranteed, consider falling back to the existing queries or skipping with an explicit “unsupported query” message so Scenario 2 validates something useful.
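That fallback could be a small selection helper; this is a sketch (`pickCallGraphQuery` is an illustrative name, and the query names come from the review comment above):

```typescript
// Prefer CallGraphFromTo when available, otherwise fall back to the
// queries the review notes are currently bundled; undefined means skip.
function pickCallGraphQuery(available: Set<string>): string | undefined {
  for (const name of ['CallGraphFromTo', 'CallGraphFrom', 'CallGraphTo']) {
    if (available.has(name)) return name;
  }
  return undefined; // caller should report "unsupported query" and skip
}
```

Returning `undefined` rather than throwing lets Scenario 2 emit an explicit skip message instead of a hard failure.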
```typescript
outdir: 'dist/test/extended',
outfile: undefined,
outExtension: { '.js': '.cjs' },
external: [], // No externals — fully self-contained
logOverride: {
```
extendedTestConfig sets external: [], but the extended test code dynamically imports 'vscode'. With no externalization, esbuild will try to resolve/bundle 'vscode' and the build will fail (since it's provided by the extension host, not npm). Keep 'vscode' external for this target so the optional import can be attempted and caught at runtime.
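A sketch of the suggested tweak, keeping only the fields shown in the snippet above (the rest of the esbuild config is elided):

```typescript
// Keep 'vscode' external so the optional dynamic import is resolved (and
// caught) at runtime by the extension host instead of failing the bundle.
const extendedTestConfig = {
  outdir: 'dist/test/extended',
  outfile: undefined,
  outExtension: { '.js': '.cjs' },
  external: ['vscode'], // provided by the extension host, not resolvable from npm
};

console.log(extendedTestConfig.external);
```

`external` tells esbuild to leave matching import paths as-is in the output rather than trying to resolve and inline them.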
Summary
Add a config-driven extended integration test playground that validates MCP server tools against real CodeQL databases downloaded from GitHub.
Changes
Extended test runner (`extensions/vscode/test/extended/`)

- `run-extended-tests.ts` — 7-scenario test orchestrator with real MCP server instances
- `download-databases.ts` — Database discovery and download:
  - GitHub API (`GET /repos/{owner}/{repo}/code-scanning/codeql/databases/{language}`)
  - Auth fallback: `GH_TOKEN` → `gh auth token` CLI
- `repos.json` — Test configuration for 4 repositories:
  - `gin-gonic/gin` (Go)
  - `expressjs/express` (JavaScript)
  - `checkstyle/checkstyle` (Java)
  - `PyCQA/flake8` (Python)

Build infrastructure

- `esbuild.config.js` — Updated for extended test bundling
- `eslint.config.mjs` — Updated ESLint config for extended test files
- `extensions/vscode/package.json` — Added `test:extended` script
- `.gitignore` — Ignore downloaded databases in test directory

Testing

- `npm run test:extended` (not part of CI)

Review order
This PR is independent and can be reviewed/merged at any time.
Closes #167
Part of #163