fix/security: shell-escape attacker-controlled filenames in batch change templates#1332
Conversation
keegancsmith
left a comment
There was a problem hiding this comment.
lib is vendored in from our monorepo, so you need to make changes rather in the monorepo and then vendor it back in here.
I'm also not sure we should be escaping these values. Our templates are not just used for constructing commands in batch changes. I'll let @burmudar comment further, I think he had looked into this stuff not so long ago?? (Or did we both look into it? :) )
Generally batch changes are run on repos you have write access to though... which makes this not really an issue in practice. However, I do see us needing to provide some way to do shell escaping since that means previous uses of this would naively break i fyou r filename had a space in it
| return r.FileMatches | ||
| paths := make([]string, len(r.FileMatches)) | ||
| copy(paths, r.FileMatches) | ||
| sort.Strings(paths) |
There was a problem hiding this comment.
Why not? the previous version of the function sorted the paths.
Yeah we did at some point, it's all a bit muddled now but I took a jog down memory lane. This templating is used to render into
|
I disagree. Even if you have access to the repository, that doesn't necessarily make it a trusted source you want to be running code from. Since the attack vector here is a filename, just by including a malicious repo in your spec can trigger command execution locally. |
Note: this issue was brought to our attention via a HackerOne report.
Problem
src-clirenders batch specrun:fields through Go'stext/template, which performs no output escaping, and thejoinbuiltin is a plainstrings.Join. The rendered string is written to a script file and executed under/bin/shinside the batch-change container.Several template variables expose raw filenames parsed from
git diffoutput and from Sourcegraph search results:repository.search_result_pathssteps.{modified,added,deleted,renamed}_filesprevious_step.{modified,added,deleted,renamed}_filesstep.{modified,added,deleted,renamed}_filesBecause git permits filenames containing shell metacharacters (
`,$,;,|,&,>,<, newlines, …), an attacker who controls filenames in a target repository can inject arbitrary shell commands into the rendered script. The reproducer in the report uses the canonicalgofmt -w ${{ join repository.search_result_paths " " }}example from the Batch Changes docs to exfiltrateSRC_ACCESS_TOKENand tamper with the workspace.This is a supply-chain attack vector: planting a maliciously named file in a public repo is enough to trigger code execution in any organization that runs a batch change matching it. The container has the workspace mounted read-write plus access to
SRC_ACCESS_TOKENand anyenv:secrets.Solution
Shell-escape every filename-bearing value at the template
FuncMapboundary usinggithub.com/kballard/go-shellquote, so the rendered text is always safe to splat into/bin/shregardless of what the underlying spec author wrote.Changes in
lib/batches/template/templating.go:Added a
shellEscapeAll([]string) []stringhelper that runsshellquote.Joinover each element. Elements without metacharacters pass through unmodified, so existing specs targeting benign filenames keep producing identical output.Applied it to the four
*_filesfields in bothStepContext.ToFuncMap(coversstep,previous_step,steps) andChangesetTemplateContext.ToFuncMap(covers changeset templates).Pre-escaped each element of
Repository.SearchResultPaths()so${{ repository.search_result_paths }}and${{ join repository.search_result_paths " " }}are both safe.Kept the slice shape (
[]string) rather than collapsing into a pre-joined string, so${{ join … " " }}and${{ range … }}keep working unchanged for spec authors.Left
stdout/stderrraw. They are routinely captured intooutputsand reused as plain values (e.g. a filename written byecho); pre-quoting silently changes those semantics. Spec authors who splat stdio against untrusted data should opt in with the newshellquote_jointemplate builtin:This trade-off is documented inline next to the assignment.
The fix is purely at the template layer; the
git diffparser inlib/batches/git/changes.gois intentionally left alone so unusual-but-legitimate filenames aren't silently dropped.Verification Evidence
Regression test
Added
lib/batches/template/templating_security_test.go(TestVULN91_NoShellInjectionFromFilenames). It feeds the report's exact PoC filenamesthrough every previously vulnerable variable in both
StepContextandChangesetTemplateContext, and asserts that none of`id,$(,; echo,> /tmp/PWNED, orPWNED2appear outside a single-quoted shell region in the rendered output.Regression test catches the unpatched code
Reverting just the
shellEscapeAll(res.ChangedFiles.Modified)call to its originalres.ChangedFiles.Modifiedand re-running:i.e. the test loudly reproduces the exact injection from the HackerOne report on unpatched code, and passes on the patched code.
Full test suite
Pre-existing executor tests that exercise
${{ join previous_step.modified_files " " }}and${{ previous_step.modified_files }}continue to render the same output for benign filenames (shellquote.Join("modified.txt")→modified.txt), so no batch spec authored against the documented examples sees a behavior change.Test Plan
TestVULN91_NoShellInjectionFromFilenamescovers every affected variable on bothStepContextandChangesetTemplateContext.go test ./...passes for both modules.