Skip to content

Extended integration test playground with real CodeQL databases#171

Draft
data-douser wants to merge 1 commit intomainfrom
dd/extended-tests
Draft

Extended integration test playground with real CodeQL databases#171
data-douser wants to merge 1 commit intomainfrom
dd/extended-tests

Conversation

@data-douser
Copy link
Collaborator

Summary

Add a config-driven extended integration test playground that validates MCP server tools against real CodeQL databases downloaded from GitHub.

Changes

Extended test runner (extensions/vscode/test/extended/)

  • run-extended-tests.ts — 7-scenario test orchestrator with real MCP server instances:

    1. Database resolution and listing
    2. Query execution with BQRS decode
    3. SARIF interpretation
    4. CallGraphFromTo with external predicates
    5. BQRS info tool
    6. BQRS interpretation tool
    7. Multi-database scenario
  • download-databases.ts — Database discovery and download:

    • Discovers existing databases from vscode-codeql storage
    • Downloads via GitHub API (GET /repos/{owner}/{repo}/code-scanning/codeql/databases/{language})
    • Three-tier auth: VS Code session → GH_TOKENgh auth token CLI
  • repos.json — Test configuration for 4 repositories:

    • gin-gonic/gin (Go)
    • expressjs/express (JavaScript)
    • checkstyle/checkstyle (Java)
    • PyCQA/flake8 (Python)
    • Each with CallGraphFromTo source/target function pairs

Build infrastructure

  • esbuild.config.js — Updated for extended test bundling
  • eslint.config.mjs — Updated ESLint config for extended test files
  • extensions/vscode/package.json — Added test:extended script
  • .gitignore — Ignore downloaded databases in test directory

Testing

  • 35/0 passing with 4 real CodeQL databases when run locally
  • Tests are opt-in: only run via npm run test:extended (not part of CI)

Review order

This PR is independent and can be reviewed/merged at any time.

Closes #167
Part of #163

Add config-driven extended test runner that validates MCP server tools
against real CodeQL databases from GitHub.

Infrastructure:
- run-extended-tests.ts: 7-scenario test orchestrator with server log capture
- download-databases.ts: DB discovery from vscode-codeql storage + GitHub API download
- repos.json: 4 test repos (gin-gonic/gin, expressjs/express, checkstyle/checkstyle, PyCQA/flake8)
- Three-tier auth: VS Code session -> GH_TOKEN -> gh auth token

Test scenarios:
- Database resolution and listing
- Query execution (BQRS decode, SARIF interpretation)
- CallGraphFromTo with external predicates
- BQRS info and interpretation tools

Build infrastructure:
- esbuild config updated for extended test bundling
- ESLint config updated for test files
- .gitignore updated for downloaded databases

Closes #167
Copilot AI review requested due to automatic review settings March 25, 2026 11:53
@github-actions
Copy link
Contributor

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 926ea89.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

Scanned Files

None

console.log(` ⬇ Downloading: ${owner}/${repoName} (${language})...`);

try {
const response = await fetch(url, {

Check warning

Code scanning / CodeQL

File data in outbound network request Medium test

Outbound network request depends on
file data
.
Outbound network request depends on
file data
.
const nested = join(dbDir, entries[0]);
if (existsSync(join(nested, 'codeql-database.yml'))) {
// Copy all contents up, then remove the nested directory
execFileSync('bash', ['-c', `cp -a "${nested}"/. "${dbDir}/" && rm -rf "${nested}"`]);

Check warning

Code scanning / CodeQL

Shell command built from environment values Medium test

This shell command depends on an uncontrolled
absolute path
.
This shell command depends on an uncontrolled
absolute path
.
This shell command depends on an uncontrolled
absolute path
.

Copilot Autofix

AI about 22 hours ago

In general, to fix this kind of problem you should avoid passing dynamically constructed command strings to a shell (bash -c or sh -c). Instead, either (1) call the underlying program directly with arguments passed as an array, so that they are not interpreted by the shell, or (2) implement the needed filesystem operations directly in your language using standard libraries. Both approaches prevent shell metacharacters in paths from changing the command’s meaning.

For this specific case, the shell is only being used to run cp -a "${nested}"/. "${dbDir}/" && rm -rf "${nested}", which copies the contents of one directory to another and then deletes the source. We can replicate this logic in pure Node.js using fs operations, avoiding bash entirely. Since the surrounding code already uses synchronous filesystem operations (existsSync, mkdirSync, readdirSync, statSync), it is consistent to use synchronous copy and delete functions as well.

The best minimal-impact fix is:

  • Remove the bash usage and implement a small helper that:
    • Ensures the destination directory exists (mkdirSync with { recursive: true }).
    • Recursively copies all files and subdirectories from nested into dbDir. The Node.js fs module (from Node 16.7+ / 18+ depending on tooling) provides cpSync for this, which preserves directory structure and can copy recursively. If available in the project’s runtime, this is the simplest and most robust choice.
    • Removes the nested directory once the copy succeeds, using rmSync(nested, { recursive: true, force: true }) (Node 14.14+).
  • To avoid broad changes, add cpSync and rmSync to the existing fs import in extensions/vscode/test/extended/download-databases.ts and replace only the execFileSync('bash', ...) call with these fs operations.

Concretely:

  • In extensions/vscode/test/extended/download-databases.ts, update the import from fs to include cpSync and rmSync.
  • Replace lines 131–133 (the comment and the execFileSync('bash', ...) line) with a comment plus equivalent cpSync and rmSync calls. Keep the behavior (copy contents of nested into dbDir and then delete nested), and preserve the directory structure.

No changes are required in run-extended-tests.ts; once the shell usage is removed, all three CodeQL variants will be addressed because the tainted path no longer reaches a shell sink.

Suggested changeset 1
extensions/vscode/test/extended/download-databases.ts

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/extensions/vscode/test/extended/download-databases.ts b/extensions/vscode/test/extended/download-databases.ts
--- a/extensions/vscode/test/extended/download-databases.ts
+++ b/extensions/vscode/test/extended/download-databases.ts
@@ -8,7 +8,7 @@
  * Downloads are cached on disk and reused if less than 24 hours old.
  */
 
-import { createWriteStream, existsSync, mkdirSync, readFileSync, readdirSync, statSync } from 'fs';
+import { createWriteStream, existsSync, mkdirSync, readFileSync, readdirSync, statSync, cpSync, rmSync } from 'fs';
 import { join } from 'path';
 import { execFileSync } from 'child_process';
 import { homedir } from 'os';
@@ -129,7 +129,8 @@
       const nested = join(dbDir, entries[0]);
       if (existsSync(join(nested, 'codeql-database.yml'))) {
         // Copy all contents up, then remove the nested directory
-        execFileSync('bash', ['-c', `cp -a "${nested}"/. "${dbDir}/" && rm -rf "${nested}"`]);
+        cpSync(nested, dbDir, { recursive: true });
+        rmSync(nested, { recursive: true, force: true });
       }
     }
 
EOF
@@ -8,7 +8,7 @@
* Downloads are cached on disk and reused if less than 24 hours old.
*/

import { createWriteStream, existsSync, mkdirSync, readFileSync, readdirSync, statSync } from 'fs';
import { createWriteStream, existsSync, mkdirSync, readFileSync, readdirSync, statSync, cpSync, rmSync } from 'fs';
import { join } from 'path';
import { execFileSync } from 'child_process';
import { homedir } from 'os';
@@ -129,7 +129,8 @@
const nested = join(dbDir, entries[0]);
if (existsSync(join(nested, 'codeql-database.yml'))) {
// Copy all contents up, then remove the nested directory
execFileSync('bash', ['-c', `cp -a "${nested}"/. "${dbDir}/" && rm -rf "${nested}"`]);
cpSync(nested, dbDir, { recursive: true });
rmSync(nested, { recursive: true, force: true });
}
}

Copilot is powered by AI and may make mistakes. Always verify output.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an opt-in “extended integration test playground” under the VS Code extension that can download/resolve real CodeQL databases and run multi-scenario validation against a locally spawned MCP server.

Changes:

  • Introduces a standalone extended test runner (run-extended-tests.ts) plus a database discovery/download helper (download-databases.ts) and config (repos.json).
  • Updates VS Code extension build + lint config to bundle and allow the extended runner to execute via Node.
  • Adds a new integration test script entry and ignores generated CodeQL CLI diagnostic files.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
extensions/vscode/test/extended/run-extended-tests.ts New orchestrator that resolves databases, spawns the server over stdio, and runs 7 scenarios.
extensions/vscode/test/extended/download-databases.ts New helper to discover existing DBs and download missing ones via GitHub REST API with token fallback.
extensions/vscode/test/extended/repos.json Config for repos/languages and scenario parameters.
extensions/vscode/esbuild.config.js Adds a separate esbuild target for bundling the extended runner.
extensions/vscode/eslint.config.mjs Adds fetch to globals for the extended test code.
extensions/vscode/package.json Adds test:integration:extended script to run the bundled extended runner.
.gitignore Ignores CodeQL CLI diagnostic JSON files.

Comment on lines +157 to +162
const candidates = [
join(home, 'Library', 'Application Support', 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
join(home, 'Library', 'Application Support', 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
join(home, '.config', 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
join(home, '.config', 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
];
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list of vscode-codeql globalStorage locations only covers macOS/Linux. On Windows, the globalStorage directory is typically under %APPDATA%/Code/User/globalStorage/...; without adding Windows candidates, discovery will miss existing databases unless users manually configure search dirs.

Suggested change
const candidates = [
join(home, 'Library', 'Application Support', 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
join(home, 'Library', 'Application Support', 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
join(home, '.config', 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
join(home, '.config', 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
];
const candidates = [
// macOS
join(home, 'Library', 'Application Support', 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
join(home, 'Library', 'Application Support', 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
// Linux
join(home, '.config', 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
join(home, '.config', 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
];
// Windows: %APPDATA%\Code\User\globalStorage\GitHub.vscode-codeql
if (process.platform === 'win32' && process.env.APPDATA) {
candidates.push(
join(process.env.APPDATA, 'Code', 'User', 'globalStorage', 'GitHub.vscode-codeql'),
join(process.env.APPDATA, 'Code', 'User', 'globalStorage', 'github.vscode-codeql'),
);
}

Copilot uses AI. Check for mistakes.
Comment on lines +465 to +467
// Build colon-delimited database dirs for the MCP server
const databaseDirs = [...new Set(Array.from(databases.values()).map(p => resolve(p, '..')))].join(':');

Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CODEQL_DATABASES_BASE_DIRS is parsed using the platform path-list delimiter (path.delimiter: ':' on POSIX, ';' on Windows). Building databaseDirs with ':' will break database discovery on Windows; use path.delimiter when joining database base dirs (and avoid hardcoding ':').

Copilot uses AI. Check for mistakes.
Comment on lines +116 to +120
// Stream to zip file
const dest = createWriteStream(zipPath);
// @ts-expect-error — ReadableStream → NodeJS.ReadableStream interop
await pipeline(response.body, dest);

Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

response.body from fetch() is a Web ReadableStream in Node 18+; stream/promises.pipeline expects a Node stream/AsyncIterable. This is likely to throw at runtime. Convert via Readable.fromWeb(response.body) (or buffer the response) before piping to the file stream.

Copilot uses AI. Check for mistakes.
Comment on lines +226 to +230
// List all cached entries
const lookupResult = await client.callTool({
arguments: {},
name: 'query_results_cache_lookup',
});
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scenario assumes query_results_cache_* tools exist. In the current server build in this repo, those tools are not present, so callTool will fail with “tool not found” and the extended runner will always report failures. To keep this runner independent/opt-in, consider calling client.listTools() up front and skipping cache scenarios when the required tools aren’t available.

Copilot uses AI. Check for mistakes.
Comment on lines +101 to +117
// Capture server stderr for the test report
// The transport exposes the child process after start()
const origStart = transport.start.bind(transport);
transport.start = async function (...args: Parameters<typeof transport.start>) {
const result = await origStart(...args);
// Access the spawned process stderr via the transport's internal process
const proc = (transport as unknown as { _process?: { stderr?: { on: (e: string, cb: (d: Buffer) => void) => void } } })._process;
if (proc?.stderr) {
proc.stderr.on('data', (chunk: Buffer) => {
const text = chunk.toString();
serverLogLines.push(text);
// Also echo to our stderr so the user can see live output
process.stderr.write(text);
});
}
return result;
};
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This relies on monkey-patching transport.start and accessing a private _process field to capture stderr. That is brittle against SDK internals changing and may break log capture silently. Prefer a supported stderr hook from the transport (if available) or manage the child process spawning directly for predictable logging.

Suggested change
// Capture server stderr for the test report
// The transport exposes the child process after start()
const origStart = transport.start.bind(transport);
transport.start = async function (...args: Parameters<typeof transport.start>) {
const result = await origStart(...args);
// Access the spawned process stderr via the transport's internal process
const proc = (transport as unknown as { _process?: { stderr?: { on: (e: string, cb: (d: Buffer) => void) => void } } })._process;
if (proc?.stderr) {
proc.stderr.on('data', (chunk: Buffer) => {
const text = chunk.toString();
serverLogLines.push(text);
// Also echo to our stderr so the user can see live output
process.stderr.write(text);
});
}
return result;
};
// Start the transport and capture server stderr for the test report
await transport.start();
const proc = (transport as unknown as {
process?: { stderr?: { on: (e: string, cb: (d: Buffer) => void) => void } };
}).process;
if (proc?.stderr) {
proc.stderr.on('data', (chunk: Buffer) => {
const text = chunk.toString();
serverLogLines.push(text);
// Also echo to our stderr so the user can see live output
process.stderr.write(text);
});
}

Copilot uses AI. Check for mistakes.
Comment on lines +129 to +133
const nested = join(dbDir, entries[0]);
if (existsSync(join(nested, 'codeql-database.yml'))) {
// Copy all contents up, then remove the nested directory
execFileSync('bash', ['-c', `cp -a "${nested}"/. "${dbDir}/" && rm -rf "${nested}"`]);
}
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flattening with bash -c cp ... && rm -rf ... is not cross-platform (Windows) and assumes GNU tools. Prefer implementing the flatten step with Node fs APIs so the downloader works consistently across dev environments.

Copilot uses AI. Check for mistakes.
settings: {
databaseDir: string;
fixtureSearchDirs?: string[];
timeoutMs: number;
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config.settings.timeoutMs is defined in the config schema and repos.json but is never used in the runner. Either wire it into the MCP client/tool-call timeouts (or scenario execution timeout) or remove it from the config to avoid misleading configuration knobs.

Suggested change
timeoutMs: number;

Copilot uses AI. Check for mistakes.
Comment on lines +291 to +295
owner: repo.owner,
repo: repo.repo,
},
name: 'audit_store_findings',
});
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Audit/annotation tool calls are unconditional, but the current server build in this repo does not expose audit_* tools. To keep the extended runner opt-in and resilient across server versions, consider listing tools at startup and skipping Scenario 5 when unsupported (instead of hard-failing).

Copilot uses AI. Check for mistakes.
Comment on lines +188 to +192
format: 'sarif-latest',
queryLanguage: repo.language,
queryName: 'CallGraphFromTo',
sourceFunction,
targetFunction,
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runner uses queryName 'CallGraphFromTo', but the server’s bundled tool queries in this repo currently include CallGraphFrom/CallGraphTo (not CallGraphFromTo). If CallGraphFromTo isn’t guaranteed, consider falling back to the existing queries or skipping with an explicit “unsupported query” message so Scenario 2 validates something useful.

Copilot uses AI. Check for mistakes.
Comment on lines +62 to +66
outdir: 'dist/test/extended',
outfile: undefined,
outExtension: { '.js': '.cjs' },
external: [], // No externals — fully self-contained
logOverride: {
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extendedTestConfig sets external: [], but the extended test code dynamically imports 'vscode'. With no externalization, esbuild will try to resolve/bundle 'vscode' and the build will fail (since it's provided by the extension host, not npm). Keep 'vscode' external for this target so the optional import can be attempted and caught at runtime.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Extended integration test playground with real CodeQL databases

2 participants