security: extend hidden-element detection to every DOM-reading channel#1032
Open
garagon wants to merge 1 commit intogarrytan:mainfrom
Open
security: extend hidden-element detection to every DOM-reading channel#1032garagon wants to merge 1 commit intogarrytan:mainfrom
garagon wants to merge 1 commit intogarrytan:mainfrom
Conversation
The Confusion Protocol envelope wrap (`wrapUntrustedPageContent`)
covers every scoped PAGE_CONTENT_COMMAND, but the hidden-element
ARIA-injection detection layer only ran for `text`. Other DOM-reading
channels (html, links, forms, accessibility, attrs, data, media,
ux-audit) returned their output through the envelope with no hidden-
content filter, so a page serving a display:none div that instructs
the agent to disregard prior system messages, or an aria-label that
claims to put the LLM in admin mode, leaked the injection payload on
any non-text channel. The envelope alone does not mitigate this, and
the page itself never rendered the hostile content to the human
operator.
Fix:
* New export `DOM_CONTENT_COMMANDS` in commands.ts — the subset of
PAGE_CONTENT_COMMANDS that derives its output from the live DOM.
Console and dialog stay out; they read separate runtime state.
* server.ts runs `markHiddenElements` + `cleanupHiddenMarkers` for
every scoped command in this set. `text` keeps its existing
`getCleanTextWithStripping` path (hidden elements physically
stripped before the read). All other channels keep their output
format but emit flagged elements as CONTENT WARNINGS on the
envelope, so the LLM sees what it would otherwise have consumed
silently.
* Hidden-element descriptions merge into `combinedWarnings`
alongside content-filter warnings before the wrap call.
Tests: new describe block in content-security.test.ts covering
* `DOM_CONTENT_COMMANDS` export shape and channel membership;
* dispatch gates on `DOM_CONTENT_COMMANDS.has(command)`, not the
literal `text` string;
* hiddenContentWarnings plumbs into `combinedWarnings` and reaches
wrapUntrustedPageContent;
* DOM_CONTENT_COMMANDS is a strict subset of PAGE_CONTENT_COMMANDS.
Existing datamarking, envelope wrap, centralized-wrapping, and chain
security suites stay green (52 pass, 0 fail).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The Confusion Protocol envelope wrap already covers every scoped
PAGE_CONTENT_COMMAND—text,html,links,forms,accessibility,attrs,console,dialog,media,data,ux-audit. The hidden-element / ARIA-injection detection layer (markHiddenElements+getCleanTextWithStrippingincontent-security.ts) only ran whencommand === 'text'. For every other DOM-reading channel the output went through the envelope with no hidden-content filter.Net effect: a page that carries a
display:nonediv with payload text, or a button with anaria-labelmatching one of the injection patterns, has that payload leak to the LLM the moment the agent callsbrowse html,browse accessibility,browse attrs, or any of the other non-text channels. The envelope wrap by itself does not mitigate this — it only tells the model that content is untrusted, not that the visible page never actually rendered those bytes to the human operator.Reproduction (before the fix)
The PoC simulates the dispatcher against a page that renders a hidden injection div plus an aria-label injection, then walks every
PAGE_CONTENT_COMMAND. With the current code, every non-text channel emits the hidden payload inside the envelope. Expected after the fix: no channel returns the raw payload without a CONTENT WARNINGS header flagging it.Root cause
browse/src/server.tsaround the read dispatch:The condition gates on the literal
'text'string. Every other scoped channel falls through tohandleReadCommandand never touches the hidden-element detector. The envelope wrap later in the same handler does not know which content came from hidden nodes, so it cannot flag them.Fix
Two parts, each small on its own.
New export
DOM_CONTENT_COMMANDSinbrowse/src/commands.ts. Subset ofPAGE_CONTENT_COMMANDSwhose output is derived from the live DOM tree:text,html,links,forms,accessibility,attrs,media,data,ux-audit.consoleanddialogstay out — they read separate runtime state (captured console output, queued dialog events), so running the DOM detector would be wasteful and potentially racy against navigation.Dispatcher gates on the set, not on
text.browse/src/server.tsnow does:And the centralized wrap block folds those descriptions into
combinedWarningsbefore the envelope call:textstill gets physical stripping viagetCleanTextWithStripping(no behavior change on that path). Every other scoped DOM channel keeps its output format unchanged and now also emits a⚠ CONTENT WARNINGS:header listing the flagged hidden nodes. The LLM sees, for example:That turns the silent-leak into a visible, flagged payload the LLM can refuse.
What stayed the same
textcommand still usesgetCleanTextWithStripping— visible hidden nodes are physically removed before the read, exactly as before.console,dialog— not inDOM_CONTENT_COMMANDS, no detector runs, no behavior change. These read runtime state, not the DOM.wrapUntrustedContentbackward-compat path.Sibling review
Greped
browse/src/for everymarkHiddenElementsorgetCleanTextWithStrippingcall:server.tsread dispatchcommand === 'text'DOM_CONTENT_COMMANDS.has(command)content-security.tshelpersNo other call site invokes the hidden-element detector. All scoped DOM-channel output now passes through the same gate.
Tests
browse/test/content-security.test.ts— newDOM-content channel coveragedescribe block, 5 cases:commands.tsexportsDOM_CONTENT_COMMANDS.text,html,links,forms,accessibility,attrs,media,data,ux-audit, and does not includeconsoleordialog.DOM_CONTENT_COMMANDS.has(command)and contains bothmarkHiddenElementsandcleanupHiddenMarkers(source-level lock — if a future refactor narrows the gate back to'text', the test trips).hiddenContentWarningsplumbs intocombinedWarningsand reacheswrapUntrustedPageContent.DOM_CONTENT_COMMANDSis a strict subset ofPAGE_CONTENT_COMMANDS(runtime import check).bun test browse/test/content-security.test.ts: 52 pass, 0 fail.Negative control
Reverting
browse/src/server.tsandbrowse/src/commands.tstoorigin/mainand rerunning:DOM_CONTENT_COMMANDSimport fails first (not exported yet).server.tsfail.Applying the fix: 52/52 pass.
Files
How to verify