
security: gate download + scrape through validateNavigationUrl (SSRF)#1029

Open
garagon wants to merge 1 commit into garrytan:main from garagon:security/download-scrape-ssrf

Conversation


garagon commented Apr 16, 2026

Summary

goto already routes every URL through validateNavigationUrl. The download and scrape commands don't. They call page.request.fetch(url, { timeout: 30000 }) directly against a caller-supplied URL, and the response body either comes back to the client as base64 or lands on disk where GET /file serves it.

That means a caller with the default write scope can do:

curl -H "Authorization: Bearer $T" \
     -X POST https://HOST/command \
     -d '{"command":"download","args":["http://169.254.169.254/latest/meta-data/iam/security-credentials/"]}'

and receive the AWS IMDSv1 body. Same trick with metadata.google.internal, metadata.azure.internal, [fd00::], 0xA9FEA9FE, or any internal IPv4/IPv6 the daemon happens to route to. The scrape command multiplies the exposure because it loops over every <a href> on a page and fetches each one with the same unguarded path, so a single attacker-controlled page turns into N unvalidated fetches.
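The fan-out in the scrape branch can be sketched with a minimal stand-in (scrapeLinks and the fetch callback are illustrative names, not the project's code): every harvested href goes straight to the fetcher with no gate in between.

```typescript
// Illustrative sketch of the pre-fix scrape loop shape: each harvested
// href is handed to the fetcher with no validation step in between.
async function scrapeLinks(
  hrefs: string[],
  fetchOne: (url: string) => Promise<void>, // stand-in for page.request.fetch
): Promise<void> {
  for (const url of hrefs) {
    await fetchOne(url); // pre-fix: no validateNavigationUrl call before this
  }
}

// One attacker-controlled page supplying N hrefs yields N unguarded fetches.
const harvested = [
  "http://169.254.169.254/latest/meta-data/",
  "http://metadata.google.internal/computeMetadata/v1/",
  "http://[fd00::1]/admin",
];
scrapeLinks(harvested, async (url) => {
  console.log("would fetch:", url);
});
```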

The interesting part is that the blocklist already exists and already protects goto. The two call sites just don't use it.

Root cause

browse/src/write-commands.ts — the download branch (around line 1002) and the inner for loop of the scrape branch (around line 1099) both go straight from const url to page.request.fetch(url, ...). No SSRF gate, no scheme check, no metadata-host check. This sits behind validateAuth, but the write-scope bearer is the same token that /tunnel/start exposes and the one closed by #1026, so treating the auth boundary as a trust boundary for the outbound HTTP client is a weaker posture than goto already takes.

Fix

Call validateNavigationUrl(url) immediately before each page.request.fetch call site. The validator already covers:

  • Scheme allowlist (http / https) — blocks file://, javascript:, data:, chrome://.
  • Cloud metadata hostnames (metadata.google.internal, metadata.azure.internal) including trailing-dot variants.
  • 169.254.169.254 in every encoding (dotted, hex 0xA9FEA9FE, decimal 2852039166, octal 0251.0376.0251.0376).
  • IPv6 ULA (fc00::/7, fd00:: and friends) with bracket-form URL parsing.
  • Malformed URLs throw upfront instead of reaching fetch.
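The IPv4-encoding bullet is worth unpacking: WHATWG URL parsing (which Node's and Bun's `URL` implement) canonicalizes all four spellings to the same dotted-quad hostname, so a blocklist keyed on the parsed hostname covers every variant without per-encoding logic. A quick standalone check:

```typescript
// All four spellings of the AWS metadata IP parse to one canonical host
// under WHATWG URL parsing, so a hostname blocklist catches every variant.
const spellings = [
  "http://169.254.169.254/",     // dotted
  "http://0xA9FEA9FE/",          // hex
  "http://2852039166/",          // decimal
  "http://0251.0376.0251.0376/", // octal
];
const hosts = spellings.map((s) => new URL(s).hostname);
console.log(hosts); // each entry is "169.254.169.254"
```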

So the fix is a single await at each of the two call sites. No new allowlist, no new config, no behavior change for callers that were already passing valid public URLs.
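For orientation, here is a minimal sketch of a validator with the coverage described above. The real validateNavigationUrl lives in browse/src and may differ; the names, error strings, and synchronous shape here are illustrative only.

```typescript
// Minimal sketch of an SSRF navigation gate with the coverage the PR
// describes. Illustrative only; not the project's actual implementation.
function validateNavigationUrl(raw: string): void {
  const url = new URL(raw); // malformed URLs throw here, before any fetch
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    // blocks file://, javascript:, data:, chrome://
    throw new Error(`URL scheme "${url.protocol}" is not allowed`);
  }
  // WHATWG parsing canonicalizes hex/decimal/octal IPv4 spellings to
  // dotted form; strip a trailing dot to catch "metadata.google.internal."
  const host = url.hostname.replace(/\.$/, "");
  const blockedHosts = new Set([
    "metadata.google.internal",
    "metadata.azure.internal",
    "169.254.169.254",
  ]);
  if (blockedHosts.has(host)) {
    throw new Error(`host "${host}" is blocked (cloud metadata)`);
  }
  // IPv6 ULA (fc00::/7): bracket-form hosts parse with brackets intact
  if (/^\[f[cd]/i.test(url.hostname)) {
    throw new Error(`IPv6 ULA "${url.hostname}" is blocked`);
  }
}
```

With something of this shape, the gate at each call site is a one-liner ahead of page.request.fetch.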

Reproduction (before the fix)

# Terminal 1
bun run dev serve &
T=$(cat ~/.gstack/auth-token)

# Terminal 2
curl -H "Authorization: Bearer $T" \
     -X POST http://localhost:7777/command \
     -d '{"command":"download","args":["http://169.254.169.254/latest/meta-data/","--base64"]}'
# -> base64 body of AWS IMDSv1 metadata root

After the fix, the same request returns `URL scheme "http://169.254.169.254" is not allowed (cloud metadata)`.

Sibling review

Grepped browse/src/ for every page.request.fetch( call that takes a caller-supplied URL:

| Location | Source of URL | Gate | Status |
| --- | --- | --- | --- |
| `write-commands.ts` download (~1012) | `args[0]` | was: none | fixed |
| `write-commands.ts` scrape loop (~1113) | `<a href>` harvested from page | was: none | fixed |
| `goto` command (`server.ts`) | `args[0]` | `validateNavigationUrl` | unaffected |
| any other `.request.fetch(` in browse | none found taking caller input | n/a | n/a |

The new regression test walks browse/src/write-commands.ts and asserts every await page.request.fetch( sits under a preceding validateNavigationUrl call. If a future refactor adds a third caller-facing fetch path or drops one of these gates, the test turns red before the diff reaches review.

Tests

browse/test/url-validation.test.ts already covers the validator's blocklist (27 pre-existing cases: schemes, metadata in every encoding, IPv6 ULA edges, hostname false positives). This PR adds a download + scrape SSRF gate describe block that reads write-commands.ts as a string and asserts the source-level invariant:

it('every page.request.fetch sits under a preceding validateNavigationUrl', () => {
  const fetches = callsitesOf('await page.request.fetch(');
  expect(fetches.length).toBeGreaterThan(0);
  for (const idx of fetches) {
    const lead = WRITE_COMMANDS_SRC.slice(Math.max(0, idx - 400), idx);
    expect(lead).toMatch(/validateNavigationUrl\s*\(/);
  }
});

Framework-independent, DB-free, runs in under a second. bun test browse/test/url-validation.test.ts: 30 pass, 0 fail.
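The snippet above assumes two helpers, WRITE_COMMANDS_SRC and callsitesOf, which the PR does not show. A hedged sketch of what they would look like — in the real test WRITE_COMMANDS_SRC would presumably come from reading browse/src/write-commands.ts off disk; an inline stand-in keeps this sketch self-contained:

```typescript
// Stand-in for the file contents the real test would load, e.g. via
// readFileSync("browse/src/write-commands.ts", "utf8").
const WRITE_COMMANDS_SRC = [
  "validateNavigationUrl(url);",
  "const res = await page.request.fetch(url, { timeout: 30000 });",
].join("\n");

// Return the index of every occurrence of `needle` in the source string,
// so each call site's preceding context can be inspected.
function callsitesOf(needle: string): number[] {
  const hits: number[] = [];
  let at = WRITE_COMMANDS_SRC.indexOf(needle);
  while (at !== -1) {
    hits.push(at);
    at = WRITE_COMMANDS_SRC.indexOf(needle, at + 1);
  }
  return hits;
}
```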

Negative control

Reverting browse/src/write-commands.ts to origin/main and rerunning:

27 pass
 3 fail

All three failures are the new regression tests. Applying the fix: 30/30 pass.

What stayed the same

  • No change to validateNavigationUrl — same allowlist, same blocklist.
  • No change to auth, command routing, or write-scope semantics.
  • No new dependencies.
  • goto was already correct and remains unchanged.
  • The download blob-URL branch (a different strategy that feeds off page.evaluate and base64-decodes) is not affected, since it never hits page.request.fetch.

Files

 browse/src/write-commands.ts       | 15 +++++++-
 browse/test/url-validation.test.ts | 74 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 88 insertions(+), 1 deletion(-)

How to verify

git checkout security/download-scrape-ssrf
bun install
bun test browse/test/url-validation.test.ts   # 30 pass
bun test                                       # full suite, exit 0

Commit message

The `goto` command was correctly wired through validateNavigationUrl,
but `download` and `scrape` called page.request.fetch(url, ...) directly.
A caller with the default write scope could hit the /command endpoint
and ask the daemon to fetch http://169.254.169.254/latest/meta-data/
(AWS IMDSv1) or the GCP/Azure/internal equivalents. The response body
comes back as base64 or lands on disk where GET /file serves it.

Fix: call validateNavigationUrl(url) immediately before each
page.request.fetch() call site in download and in the scrape loop.
Same blocklist that already protects `goto`: file://, javascript:,
data:, chrome://, cloud metadata (IPv4 all encodings, IPv6 ULA,
metadata.*.internal).

Tests: extend browse/test/url-validation.test.ts with a source-level
guard that walks every `await page.request.fetch(` call site and
asserts a validateNavigationUrl call precedes it within the same
branch. Regression trips before code review if a future refactor
drops the gate.
