perf: batch chroot integration tests to reduce container overhead #845
Each chroot test previously spawned a fresh Docker container pair (Squid + Agent), adding ~15-25s of overhead per test. With ~73 tests, container lifecycle alone accounted for 17-29 minutes.

This introduces a batch runner utility that combines multiple commands sharing the same allowDomains config into a single AWF invocation with structured output delimiters. Individual test cases still appear in Jest output via beforeAll/test pattern.

Changes:
- Add tests/fixtures/batch-runner.ts: generates batched shell scripts, parses per-command results from delimited output
- Refactor all 5 chroot test files to batch compatible commands:
  - chroot-languages: 20 → 4 invocations
  - chroot-edge-cases: 19 → 8 invocations
  - chroot-procfs: 8 → 2 invocations
  - chroot-package-managers: 23 → 12 invocations
  - chroot-copilot-home: 3 → 1 invocation
- Remove needs: test-chroot-languages from procfs and edge-cases CI jobs so 3 of 4 jobs run in parallel immediately

Total: ~73 → ~27 AWF invocations (~63% reduction)
Estimated time saved: 11-19 minutes of container overhead

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
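To illustrate what "commands sharing the same allowDomains config" could mean in practice, here is a minimal sketch that groups test cases by their domain allowlist so each group can go into one invocation. The `ChrootCase` shape and the `groupByAllowDomains` helper are hypothetical names for illustration; the PR itself may simply group commands by hand in each test file.

```typescript
// Hypothetical test-case shape; these field names are illustrative only.
interface ChrootCase {
  name: string;
  command: string;
  allowDomains: string[];
}

/** Group cases whose allowDomains lists contain the same entries, in any order. */
function groupByAllowDomains(cases: ChrootCase[]): Map<string, ChrootCase[]> {
  const groups = new Map<string, ChrootCase[]>();
  for (const c of cases) {
    // Sort a copy so ['a', 'b'] and ['b', 'a'] produce the same key.
    const key = [...c.allowDomains].sort().join(',');
    const group = groups.get(key) ?? [];
    group.push(c);
    groups.set(key, group);
  }
  return groups;
}
```

Each resulting group would then correspond to one batched invocation, while cases with unique options stay on their own.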
|
📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤 |
|
🎬 THE END — Smoke Claude MISSION: ACCOMPLISHED! The hero saves the day! ✨ |
|
✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟 |
✅ Coverage Check Passed
📁 Per-file Coverage Changes (1 files)
Pull request overview
This PR reduces Docker container startup overhead in the chroot integration test suite by batching multiple shell commands into fewer AWF invocations, while keeping per-test assertions in Jest. It also updates the CI workflow to run more of the chroot jobs in parallel.
Changes:
- Added a `runBatch` utility to execute multiple commands in one AWF container run and parse per-command results.
- Refactored 5 chroot integration test files to use batching where `allowDomains`/options are shared.
- Removed workflow job dependencies so `/proc` and edge-case chroot tests can start without waiting on language tests.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/fixtures/batch-runner.ts | New batching helper that generates a delimited bash script and parses per-command stdout/exit codes. |
| tests/integration/chroot-procfs.test.ts | Batches quick /proc checks and Java /proc validation into two AWF invocations. |
| tests/integration/chroot-package-managers.test.ts | Batches package-manager commands by shared domain allowlists to reduce invocations. |
| tests/integration/chroot-languages.test.ts | Batches quick language/version checks into one invocation; keeps longer compile tests unbatched. |
| tests/integration/chroot-edge-cases.test.ts | Batches general localhost-only checks; keeps workdir/exit-code/network tests individual. |
| tests/integration/chroot-copilot-home.test.ts | Batches Copilot home directory write/permission checks into a single invocation. |
| .github/workflows/test-chroot.yml | Removes needs: test-chroot-languages from /proc and edge-case jobs to increase CI parallelism. |
Comments suppressed due to low confidence (1)
tests/fixtures/batch-runner.ts:72
`parseResults` builds a `RegExp` using `cmd.name` without escaping it. If a name contains regex metacharacters (e.g., `.`, `+`, `?`, `[`), the exit marker match can be incorrect or fail entirely. Escape `cmd.name` for regex (similar to how `EXIT` is escaped) or validate names to a restricted character set before use.
const startToken = `${START}${cmd.name}${DELIM_END}`;
const exitPattern = new RegExp(`${EXIT.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}${cmd.name}:(\\d+)${DELIM_END}`);
const startIdx = stdout.indexOf(startToken);
const exitMatch = stdout.match(exitPattern);
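One way to apply the suggestion is to route every interpolated piece through the same escaping helper. The following is a sketch only: the delimiter values are placeholders (the real `EXIT` and `DELIM_END` constants are not shown in this excerpt), and `escapeRegExp` and `buildExitPattern` are hypothetical helper names.

```typescript
// Placeholder delimiter values; the actual constants in batch-runner.ts may differ.
const EXIT = '===BATCH_EXIT:';
const DELIM_END = '===';

/** Escape regex metacharacters so the input is matched literally. */
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

/** Build the exit-marker pattern with the prefix, name, and suffix all escaped. */
function buildExitPattern(name: string): RegExp {
  return new RegExp(
    `${escapeRegExp(EXIT)}${escapeRegExp(name)}:(\\d+)${escapeRegExp(DELIM_END)}`
  );
}
```

With the name escaped, a `.` in a command name matches only a literal dot rather than any character.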
tests/fixtures/batch-runner.ts
Outdated
function generateScript(commands: BatchCommand[]): string {
  return commands.map(cmd => {
    // Each command runs in a subshell so failures don't abort the batch.
    // stdout and stderr are merged (2>&1) so we capture everything.
    // A blank echo before the EXIT marker ensures a newline separator.
    return [
      `echo "${START}${cmd.name}${DELIM_END}"`,
      `(${cmd.command}) 2>&1`,
      `echo ""`,
      `echo "${EXIT}${cmd.name}:$?${DELIM_END}"`,
cmd.name is interpolated directly into a double-quoted echo in the generated bash script. If a caller passes a name containing characters like ", $, backticks, or newlines, it can break the script and/or spoof delimiter lines, which will corrupt parsing. Consider validating name against a strict safe pattern (e.g., ^[A-Za-z0-9_\-]+$) and throwing early (or properly escaping for bash) before generating the script.
This issue also appears on line 68 of the same file.
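The validation suggested in this comment could look roughly like the sketch below. It assumes a `BatchCommand` with `name` and `command` fields, as the quoted snippet implies; `assertSafeNames` is a hypothetical helper, not a function from the PR.

```typescript
// Assumed shape, inferred from the cmd.name / cmd.command usage in the quoted snippet.
interface BatchCommand {
  name: string;
  command: string;
}

// Letters, digits, underscore, and hyphen only, per the pattern suggested in the review.
const SAFE_NAME = /^[A-Za-z0-9_-]+$/;

/** Throw before any script is generated if a name could break quoting or spoof a delimiter. */
function assertSafeNames(commands: BatchCommand[]): void {
  for (const { name } of commands) {
    if (!SAFE_NAME.test(name)) {
      throw new Error(`Unsafe batch command name: ${JSON.stringify(name)}`);
    }
  }
}
```

Failing early keeps the generated bash script free of names containing quotes, `$`, backticks, or newlines, so no shell escaping is needed for the delimiter lines.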
    const startIdx = stdout.indexOf(startToken);
    const exitMatch = stdout.match(exitPattern);

    if (startIdx === -1 || !exitMatch) {
      // Command output not found – likely the batch was killed early
      results.set(cmd.name, { stdout: '', exitCode: -1 });
      continue;
    }

    const contentStart = startIdx + startToken.length;
    const contentEnd = stdout.indexOf(exitMatch[0], contentStart) - 1; // -1 for the blank line
    const cmdStdout = stdout.slice(contentStart, contentEnd).trim();
exitMatch is found via stdout.match(exitPattern), which returns the first match in the whole batch output. If commands contains duplicate names, the second instance will reuse the first instance's exit marker; then stdout.indexOf(exitMatch[0], contentStart) can be -1 and slicing will produce incorrect output/exitCode. Either enforce unique names up front (throw on duplicates) or change parsing to search for the exit marker after the corresponding start token.
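Both remedies from this comment can be combined, as in the sketch below: reject duplicate names up front and look for each exit marker only after its own start token. The delimiter values and the simplified signature (a list of names rather than `BatchCommand` objects) are assumptions made for illustration; this is not the code that was merged.

```typescript
interface BatchResult {
  stdout: string;
  exitCode: number;
}

// Placeholder delimiters; the real constants in batch-runner.ts are not shown here.
const START = '===BATCH_START:';
const EXIT = '===BATCH_EXIT:';
const DELIM_END = '===';

function parseResults(stdout: string, names: string[]): Map<string, BatchResult> {
  // Enforce unique names so every marker pair is unambiguous.
  if (new Set(names).size !== names.length) {
    throw new Error('Batch command names must be unique');
  }
  const results = new Map<string, BatchResult>();
  for (const name of names) {
    const startToken = `${START}${name}${DELIM_END}`;
    const exitPrefix = `${EXIT}${name}:`;
    const startIdx = stdout.indexOf(startToken);
    const contentStart = startIdx + startToken.length;
    // Search for the exit marker only after this command's start token.
    const exitIdx = startIdx === -1 ? -1 : stdout.indexOf(exitPrefix, contentStart);
    const rest = exitIdx === -1 ? '' : stdout.slice(exitIdx + exitPrefix.length);
    const digits = /^\d+/.exec(rest);
    if (!digits || !rest.startsWith(DELIM_END, digits[0].length)) {
      // Marker missing or malformed, e.g. the batch was killed early.
      results.set(name, { stdout: '', exitCode: -1 });
      continue;
    }
    results.set(name, {
      stdout: stdout.slice(contentStart, exitIdx).trim(),
      exitCode: Number(digits[0]),
    });
  }
  return results;
}
```

Using plain string search instead of a `RegExp` built from the name also sidesteps the escaping concern, since nothing in the name is interpreted as a pattern.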
Node.js Build Test Results
Overall: PASS ✅ All Node.js test projects successfully installed dependencies and passed their test suites.
|
C++ Build Test Results
Overall: PASS ✅ All C++ projects built successfully.
|
|
Smoke Test Results Last 2 merged PRs:
✅ GitHub MCP Overall: PASS
|
Deno Build Test Results
Overall: ✅ PASS All Deno tests completed successfully.
|
|
Smoke Test Results (Copilot) ✅ Last 2 merged PRs:
✅ GitHub MCP: Retrieved PR data Overall: PASS cc @Mossaka
|
Go Build Test Results
Overall: PASS ✅ All Go projects successfully downloaded dependencies and passed their tests.
|
Bun Build Test Results
Overall: PASS ✅ All Bun projects built and tested successfully.
|
.NET Build Test Results
Overall: PASS All .NET projects successfully restored dependencies, built, and ran without errors.
|
|
Merged PRs: fix: add roles: all to smoke-codex workflow; fix(ci): add missing ANTHROPIC_API_KEY to detection job
|
Rust Build Test Results
Overall: PASS ✅ All Rust projects built and tested successfully.
|
Java Build Test Results ✅
All Java projects compiled and tested successfully through the AWF firewall.
Overall: PASS
Configuration Notes
|
C++ Build Test Results
Overall: PASS All C++ projects built successfully.
|
Build Test Results: Node.js
All Node.js test projects passed successfully! ✅
Overall: PASS ✅
|
|
Smoke Test Results 🟢 ✅ GitHub MCP: #843, #841 Status: PASS cc @Mossaka
|
Bun Build Test Results
Overall: PASS ✅ All Bun projects installed and tested successfully.
|
.NET Build Test Results
Overall: PASS ✅ All .NET projects built and ran successfully. NuGet package restore completed without errors.
|
Go Build Test Results
Overall: PASS ✅ All Go projects successfully downloaded dependencies and passed their test suites.
|
Rust Build Test Results
Overall: PASS ✅ All Rust projects built and tested successfully.
|
|
PR titles: fix: fix API proxy sidecar bugs preventing Anthropic-only usage | fix: add roles: all to smoke-codex workflow
|
Java Build Test Results ✅
Overall: PASS All Java projects compiled and tested successfully through the firewall.
|
Bun Build Test Results ✅
Overall: PASS All Bun projects built and tested successfully.
|
C++ Build Test Results
Overall: PASS ✅ All C++ projects built successfully.
|
|
Smoke Test Results (Claude) Last 2 Merged PRs:
Test Results:
Status: PASS
|
Build Test: Go - Results
Overall: PASS ✅ All Go projects successfully downloaded dependencies and passed tests.
|
.NET Build Test Results
Overall: PASS ✅ All .NET projects successfully restored, built, and ran.
|
Deno Build Test Results
Overall: ✅ PASS All Deno tests completed successfully.
|
Java Build Test Results
Overall: PASS ✅ All Java projects compiled successfully and all tests passed.
|
Rust Build Test Results
Overall: PASS All Rust projects built and tested successfully.
|
Smoke Test Results (Copilot)
Recent PRs:
Test Results:
Status: PARTIAL - 3/4 tests passed (Playwright blocked by GitHub anti-bot) cc @Mossaka
|
|
PR titles:
|
Summary
- Add `tests/fixtures/batch-runner.ts`, a utility that batches multiple shell commands into a single AWF container invocation with structured delimiters, then parses per-command results
- Refactor all 5 chroot test files to batch commands that share the same `allowDomains` config, while preserving individual test reporting in Jest
- Remove the CI dependency (`needs: test-chroot-languages`) from the `test-chroot-procfs` and `test-chroot-edge-cases` jobs, allowing 3 of 4 jobs to start immediately in parallel

Invocation counts
At ~15-25s container overhead per invocation, this saves ~11-19 minutes of pure Docker lifecycle time.
Tests that require individual invocations (exit code propagation, blocking tests, different `containerWorkDir` options) remain unbatched.

Test plan
- `npm run build` passes
- `npm run lint` passes (0 errors)
- `npm test`: all 785 unit tests pass
- `chroot-languages` job passes
- `chroot-edge-cases` job passes
- `chroot-procfs` job passes
- `chroot-package-managers` job passes

🤖 Generated with Claude Code