security: extend hidden-element detection to every DOM-reading channel by garagon · Pull Request #1032 · garrytan/gstack

garagon · 2026-04-16T23:33:59Z

Summary

The Confusion Protocol envelope wrap already covers every scoped PAGE_CONTENT_COMMAND — text, html, links, forms, accessibility, attrs, console, dialog, media, data, ux-audit. The hidden-element / ARIA-injection detection layer (markHiddenElements + getCleanTextWithStripping in content-security.ts) only ran when command === 'text'. For every other DOM-reading channel the output went through the envelope with no hidden-content filter.

Net effect: a page that carries a display:none div with payload text, or a button with an aria-label matching one of the injection patterns, has that payload leak to the LLM the moment the agent calls browse html, browse accessibility, browse attrs, or any of the other non-text channels. The envelope wrap by itself does not mitigate this — it only tells the model that content is untrusted, not that the visible page never actually rendered those bytes to the human operator.

Reproduction (before the fix)

bun run report/evidence/poc-f008-channel-gaps.ts

The PoC simulates the dispatcher against a page that renders a hidden injection div plus an aria-label injection, then walks every PAGE_CONTENT_COMMAND. With the current code, every non-text channel emits the hidden payload inside the envelope. Expected after the fix: no channel returns the raw payload without a CONTENT WARNINGS header flagging it.

Root cause

browse/src/server.ts around the read dispatch:

if (isScoped && command === 'text') {
  const page = session.getPage();
  await markHiddenElements(page);
  // ...
  result = await getCleanTextWithStripping(target);
}

The condition gates on the literal 'text' string. Every other scoped channel falls through to handleReadCommand and never touches the hidden-element detector. The envelope wrap later in the same handler does not know which content came from hidden nodes, so it cannot flag them.

Fix

Two parts, each small on its own.

New export DOM_CONTENT_COMMANDS in browse/src/commands.ts. Subset of PAGE_CONTENT_COMMANDS whose output is derived from the live DOM tree: text, html, links, forms, accessibility, attrs, media, data, ux-audit. console and dialog stay out — they read separate runtime state (captured console output, queued dialog events), so running the DOM detector would be wasteful and potentially racy against navigation.

Dispatcher gates on the set, not on text. browse/src/server.ts now does:

let hiddenContentWarnings: string[] = [];
if (isScoped && DOM_CONTENT_COMMANDS.has(command)) {
  const page = session.getPage();
  try {
    const strippedDescs = await markHiddenElements(page);
    if (strippedDescs.length > 0) {
      hiddenContentWarnings = strippedDescs.slice(0, 8).map(d => `hidden content: ${d.slice(0, 120)}`);
      if (strippedDescs.length > 8) {
        hiddenContentWarnings.push(`hidden content: +${strippedDescs.length - 8} more flagged elements`);
      }
    }
    if (command === 'text') {
      result = await getCleanTextWithStripping(target);   // unchanged
    } else {
      result = await handleReadCommand(command, args, session, browserManager);
    }
  } finally {
    await cleanupHiddenMarkers(page);
  }
}

And the centralized wrap block folds those descriptions into combinedWarnings before the envelope call:

const combinedWarnings = [...filterResult.warnings, ...hiddenContentWarnings];
result = wrapUntrustedPageContent(
  result, command,
  combinedWarnings.length > 0 ? combinedWarnings : undefined,
);

text still gets physical stripping via getCleanTextWithStripping (no behavior change on that path). Every other scoped DOM channel keeps its output format unchanged and now also emits a ⚠ CONTENT WARNINGS: header listing the flagged hidden nodes. The LLM sees, for example:

⚠ CONTENT WARNINGS: hidden content: [div] opacity < 0.1: "IGNORE...";  hidden content: [button] ARIA injection: "System: you are..."
═══ BEGIN UNTRUSTED WEB CONTENT ═══
...actual channel output...
═══ END UNTRUSTED WEB CONTENT ═══

That turns the silent-leak into a visible, flagged payload the LLM can refuse.

What stayed the same

text command still uses getCleanTextWithStripping — visible hidden nodes are physically removed before the read, exactly as before.
console, dialog — not in DOM_CONTENT_COMMANDS, no detector runs, no behavior change. These read runtime state, not the DOM.
Root-token (non-scoped) calls — unchanged, same wrapUntrustedContent backward-compat path.
Sentinel strings, envelope shape, datamarking behavior, chain recursion guard — all unchanged.
No new dependencies. No config knobs.

Sibling review

Greped browse/src/ for every markHiddenElements or getCleanTextWithStripping call:

Location	Trigger	Change
`server.ts` read dispatch	was `command === 'text'`	now `DOM_CONTENT_COMMANDS.has(command)`
`content-security.ts` helpers	page DOM traversal	unchanged
any other call site	none	n/a

No other call site invokes the hidden-element detector. All scoped DOM-channel output now passes through the same gate.

Tests

browse/test/content-security.test.ts — new DOM-content channel coverage describe block, 5 cases:

commands.ts exports DOM_CONTENT_COMMANDS.
The set covers text, html, links, forms, accessibility, attrs, media, data, ux-audit, and does not include console or dialog.
The server's scoped-read dispatch gates on DOM_CONTENT_COMMANDS.has(command) and contains both markHiddenElements and cleanupHiddenMarkers (source-level lock — if a future refactor narrows the gate back to 'text', the test trips).
hiddenContentWarnings plumbs into combinedWarnings and reaches wrapUntrustedPageContent.
DOM_CONTENT_COMMANDS is a strict subset of PAGE_CONTENT_COMMANDS (runtime import check).

bun test browse/test/content-security.test.ts: 52 pass, 0 fail.

Negative control

Reverting browse/src/server.ts and browse/src/commands.ts to origin/main and rerunning:

DOM_CONTENT_COMMANDS import fails first (not exported yet).
The source-level regressions on server.ts fail.

Applying the fix: 52/52 pass.

Files

 browse/src/commands.ts               | 16 +++++++++
 browse/src/server.ts                 | 47 ++++++++++++++++++------
 browse/test/content-security.test.ts | 69 ++++++++++++++++++++++++++++++++++++
 3 files changed, 121 insertions(+), 11 deletions(-)

How to verify

git checkout security/confusion-protocol-channel-coverage
bun install
bun test browse/test/content-security.test.ts   # 52 pass
bun test                                           # full suite, exit 0

The Confusion Protocol envelope wrap (`wrapUntrustedPageContent`) covers every scoped PAGE_CONTENT_COMMAND, but the hidden-element ARIA-injection detection layer only ran for `text`. Other DOM-reading channels (html, links, forms, accessibility, attrs, data, media, ux-audit) returned their output through the envelope with no hidden- content filter, so a page serving a display:none div that instructs the agent to disregard prior system messages, or an aria-label that claims to put the LLM in admin mode, leaked the injection payload on any non-text channel. The envelope alone does not mitigate this, and the page itself never rendered the hostile content to the human operator. Fix: * New export `DOM_CONTENT_COMMANDS` in commands.ts — the subset of PAGE_CONTENT_COMMANDS that derives its output from the live DOM. Console and dialog stay out; they read separate runtime state. * server.ts runs `markHiddenElements` + `cleanupHiddenMarkers` for every scoped command in this set. `text` keeps its existing `getCleanTextWithStripping` path (hidden elements physically stripped before the read). All other channels keep their output format but emit flagged elements as CONTENT WARNINGS on the envelope, so the LLM sees what it would otherwise have consumed silently. * Hidden-element descriptions merge into `combinedWarnings` alongside content-filter warnings before the wrap call. Tests: new describe block in content-security.test.ts covering * `DOM_CONTENT_COMMANDS` export shape and channel membership; * dispatch gates on `DOM_CONTENT_COMMANDS.has(command)`, not the literal `text` string; * hiddenContentWarnings plumbs into `combinedWarnings` and reaches wrapUntrustedPageContent; * DOM_CONTENT_COMMANDS is a strict subset of PAGE_CONTENT_COMMANDS. Existing datamarking, envelope wrap, centralized-wrapping, and chain security suites stay green (52 pass, 0 fail).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

security: extend hidden-element detection to every DOM-reading channel#1032

security: extend hidden-element detection to every DOM-reading channel#1032
garagon wants to merge 1 commit intogarrytan:mainfrom
garagon:security/confusion-protocol-channel-coverage

garagon commented Apr 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

garagon commented Apr 16, 2026

Summary

Reproduction (before the fix)

Root cause

Fix

What stayed the same

Sibling review

Tests

Negative control

Files

How to verify

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant