Skip to content

security: route scoped snapshot through envelope sentinel escape#1031

Open
garagon wants to merge 1 commit intogarrytan:mainfrom
garagon:security/snapshot-envelope-escape
Open

security: route scoped snapshot through envelope sentinel escape#1031
garagon wants to merge 1 commit intogarrytan:mainfrom
garagon:security/snapshot-envelope-escape

Conversation

@garagon
Copy link
Copy Markdown
Contributor

@garagon garagon commented Apr 16, 2026

Summary

browse/src/content-security.ts defines a trust boundary envelope around page content that gets handed to the LLM:

═══ BEGIN UNTRUSTED WEB CONTENT ═══
...page text...
═══ END UNTRUSTED WEB CONTENT ═══

The wrap path (wrapUntrustedPageContent, the one used by text / html / full-page commands) already splices a zero-width space through the word CONTENT whenever those sentinels appear inside the content itself. That's the defense: a page that renders the literal sentinel string still goes through as visible text, but no longer matches the envelope grep the LLM anchors on.

The scoped-token path in browse/src/snapshot.ts (splitForScoped, used by snapshot and resume for scoped clients) builds the envelope by hand and skips that escape. It just does:

parts.push('═══ BEGIN UNTRUSTED WEB CONTENT ═══');
parts.push(...untrustedLines);   // no escape
parts.push('═══ END UNTRUSTED WEB CONTENT ═══');

So a page whose accessibility tree renders the literal ═══ END UNTRUSTED WEB CONTENT ═══ closes the envelope early, and the attacker can forge a new INTERACTIVE ELEMENTS (trusted …) block with any @eN reference they want. The LLM consuming that output sees a fake trusted @ref that doesn't exist in the real ref map, and if it tries to click/fill it on the rendered page it lands on whatever element the attacker chose to alias.

Reproduction (before the fix)

Stripped-down PoC — runs the two helper functions verbatim against the same hostile input:

bun run report/evidence/poc-f006-confusion-protocol-bypass.ts

Output shows splitForScoped emitting two BEGIN and two END sentinels on the same string (envelope breached), while wrapUntrustedPageContent emits exactly one of each (envelope intact).

Fix

One change, applied in two small steps so each side stays easy to audit.

  1. Extract the existing escape into a named export. content-security.ts now has escapeEnvelopeSentinels(content: string): string — same two .replace() calls, no behavior change on the wrap path. wrapUntrustedPageContent calls it internally.
  2. Use it in the scoped path. snapshot.ts imports escapeEnvelopeSentinels and maps it over untrustedLines in the splitForScoped branch before pushing the BEGIN sentinel.
// browse/src/snapshot.ts — splitForScoped branch
const safeUntrusted = untrustedLines.map(escapeEnvelopeSentinels);
parts.push('═══ BEGIN UNTRUSTED WEB CONTENT ═══');
parts.push(...safeUntrusted);
parts.push('═══ END UNTRUSTED WEB CONTENT ═══');

No changes to sentinel strings, no per-request nonce, no change to the scoped-ref map, no change to wrapUntrustedPageContent output bytes for any non-hostile content. The scoped path now behaves exactly like the wrap path with respect to in-content sentinel escape.

Sibling review

Greped browse/src/ for every emission of ═══ BEGIN UNTRUSTED WEB CONTENT ═══:

Location Source of content Escape Status
content-security.ts wrapUntrustedPageContent full page content string was: inline, now: escapeEnvelopeSentinels unchanged bytes
snapshot.ts splitForScoped branch untrustedLines from accessibility tree was: none fixed
Any other BEGIN emission in browse/src/ n/a n/a n/a

No other call site emits the sentinel. Both paths now funnel untrusted text through the same escape helper.

Tests

browse/test/content-security.test.ts — a new Envelope sentinel escape describe block with 6 cases:

  • escapeEnvelopeSentinels defuses a BEGIN marker inside content.
  • escapeEnvelopeSentinels defuses an END marker inside content.
  • escapeEnvelopeSentinels leaves normal text untouched (identity on non-sentinel lines).
  • wrapUntrustedPageContent emits exactly one real envelope around hostile content that carries a forged BEGIN + END pair (regression for the already-protected wrap path).
  • snapshot.ts imports escapeEnvelopeSentinels from ./content-security.
  • The scoped-snapshot branch in snapshot.ts calls escapeEnvelopeSentinels before pushing the BEGIN sentinel (source-level lock; if a future refactor drops the escape or reorders the calls, the test trips before the diff reaches review).

bun test browse/test/content-security.test.ts: 53 pass, 0 fail.

Negative control

Reverting browse/src/snapshot.ts + browse/src/content-security.ts to origin/main and rerunning the same file:

  • escapeEnvelopeSentinels import fails first (not exported yet).
  • The two source-level regression tests on snapshot.ts fail.

Applying the fix: 53/53 pass.

What stayed the same

  • Sentinel strings unchanged (no new format, no nonce).
  • wrapUntrustedPageContent output bytes unchanged on any input that doesn't contain the sentinel.
  • No change to the scoped @eN ref resolver, no change to splitForScoped's public signature, no change to how scoped snapshots are routed through the CLI.
  • No new dependencies.
  • Instruction block (SECURITY: section in the agent preamble) unchanged. The envelope it tells the LLM to trust is now genuinely one envelope on both code paths.

Files

 browse/src/content-security.ts       | 25 ++++++++++---
 browse/src/snapshot.ts               |  9 ++++-
 browse/test/content-security.test.ts | 71 +++++++++++++++++++++++++++++++++++-
 3 files changed, 98 insertions(+), 7 deletions(-)

How to verify

git checkout security/snapshot-envelope-escape
bun install
bun test browse/test/content-security.test.ts   # 53 pass
bun test                                           # full suite, exit 0

The scoped-token snapshot path in snapshot.ts built its untrusted
block by pushing the raw accessibility-tree lines between the literal
`═══ BEGIN UNTRUSTED WEB CONTENT ═══` / `═══ END UNTRUSTED WEB CONTENT ═══`
sentinels. The full-page wrap path in content-security.ts already
applied a zero-width-space escape on those exact strings to prevent
sentinel injection, but the scoped path skipped it.

Net effect: a page whose rendered text contains the literal sentinel
can close the envelope early from inside untrusted content and forge
a fake "trusted" block for the LLM. That includes fabricating
interactive `@eN` references the agent will act on.

Fix:
  * Extract the zero-width-space escape into a named, exported helper
    `escapeEnvelopeSentinels(content)` in content-security.ts.
  * Have `wrapUntrustedPageContent` call it (behavior unchanged on
    that path — same bytes out).
  * Import the helper in snapshot.ts and map it over `untrustedLines`
    in the `splitForScoped` branch before pushing the BEGIN sentinel.

Tests: add a describe block in content-security.test.ts that covers
  * `escapeEnvelopeSentinels` defuses BEGIN and END markers;
  * `escapeEnvelopeSentinels` leaves normal text untouched;
  * `wrapUntrustedPageContent` still emits exactly one real envelope
    pair when hostile content contains forged sentinels;
  * snapshot.ts imports the helper;
  * the scoped-snapshot branch calls `escapeEnvelopeSentinels` before
    pushing the BEGIN sentinel (source-level regression — if a future
    refactor reorders this, the test trips).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant