Add receiver-side resource limits for untrusted peers (#184) by ndisidore · Pull Request #185 · cloudflare/capnweb

ndisidore · 2026-05-30T20:01:46Z

Summary

Deserializing untrusted input could exhaust a peer's resources. Most concretely (#184), a ["bigint", "<digits>"] wire value was handed straight to BigInt() with no length check, and BigInt() decimal parsing is superlinear — so a multi-megabyte digit string blocks the event loop. While fixing that, the receive path is also bounded against the related single-message exhaustion vectors the issue calls out ("review that we're properly limiting message sizes overall").

The guards are local, receiver-side decisions that fit capnweb's existing design: no new dependencies, no protocol changes, and reuse of the existing abort path on violation.

What this changes

bigint length cap + numeric validation — reject digit strings longer than the limit, and reject anything that isn't an optional sign followed by ASCII decimal digits, before calling BigInt().
Message nesting depth guard — evaluateImpl now tracks recursion depth and rejects messages nested beyond the limit (mirroring the existing send-side depth cap), so a tiny deeply-nested message can't overflow the stack.
Maximum message size — incoming messages larger than the limit are rejected before JSON.parse, bounding the up-front allocation.
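The bigint guard in the first item can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `parseBoundedBigInt` and `DECIMAL_BIGINT` are hypothetical names, and the non-numeric error message is invented for the example. The length-limit message follows the one quoted in the diff hunks further down. Note that this sketch counts the whole string, sign included, against the limit.

```typescript
// Hypothetical helper (not capnweb source): validate a wire string before
// handing it to BigInt(), so parsing cost is bounded by a known length.
const DECIMAL_BIGINT = /^[+-]?[0-9]+$/;

function parseBoundedBigInt(text: string, maxBigIntDigits: number): bigint {
  // Cheap O(1) length check first, so an oversized string is never scanned.
  if (text.length > maxBigIntDigits) {
    throw new TypeError(
      `Deserialized bigint exceeds maximum length of ${maxBigIntDigits} digits.`);
  }
  // Only an optional sign followed by ASCII decimal digits is accepted.
  // Inputs such as "0xff", "1e3", " 12 " or "" are rejected here.
  if (!DECIMAL_BIGINT.test(text)) {
    throw new TypeError("Deserialized bigint is not a decimal integer.");
  }
  return BigInt(text);
}
```

Checking the length before running the regular expression keeps the rejection path constant-time for very long inputs.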

Configuration

Limits are always applied from the exported DEFAULT_LIMITS, and can be overridden per session via a new optional limits field on RpcSessionOptions:

new RpcSession(transport, target, {
  limits: { maxBigIntDigits: 4096, maxDepth: 32, maxMessageSize: 8 * 1024 * 1024 },
});

Defaults:

maxBigIntDigits: 16384 — 2^14 digits ≈ ~54,000 bits, far beyond any practical cryptographic integer (an RSA-16384 modulus is ~4,933 decimal digits), so no legitimate value is rejected — yet orders of magnitude below the millions of digits needed for BigInt()'s superlinear parse to block meaningfully.
maxDepth: 64 — matches the existing send-side depth limit, so capnweb never rejects a message it would itself send.
maxMessageSize: 32 MiB — generous for legitimate batched and base64/blob payloads while still bounding a single allocation.

Because the protocol has no negotiation step and a limit is a purely local receiver-side decision, the defaults are deliberately generous: a sender cannot discover the receiver's limit, so the defaults must never trip legitimate traffic. They are tunable per session for endpoints with unusual needs.
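As a rough sketch of how per-session overrides might be layered over the defaults: the field names and default values below come from this description, and `RpcLimits` is the type name used in the commit message, but `resolveLimits` is a hypothetical helper. Whether the real `limits` option accepts a partial object is an assumption made here for illustration.

```typescript
// Field names and values taken from the PR description; the shape of the
// real exported type may differ.
interface RpcLimits {
  maxBigIntDigits: number;
  maxDepth: number;
  maxMessageSize: number;
}

const DEFAULT_LIMITS: RpcLimits = {
  maxBigIntDigits: 16384,
  maxDepth: 64,
  maxMessageSize: 32 * 1024 * 1024, // 32 MiB
};

// Hypothetical helper: any field the caller omits (or sets to undefined)
// falls back to the default, so a limit is always in force.
function resolveLimits(overrides: Partial<RpcLimits> = {}): RpcLimits {
  return {
    maxBigIntDigits: overrides.maxBigIntDigits ?? DEFAULT_LIMITS.maxBigIntDigits,
    maxDepth: overrides.maxDepth ?? DEFAULT_LIMITS.maxDepth,
    maxMessageSize: overrides.maxMessageSize ?? DEFAULT_LIMITS.maxMessageSize,
  };
}
```

Resolving field by field, rather than spreading the override object over the defaults, avoids an explicit `undefined` silently disabling a limit.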

Behavior on violation

Exceeding any limit throws, which flows through the existing readLoop → abort() path: the session sends an ["abort", ...] frame and tears down. The thrown errors name the limit and its value, so a cooperative peer sees a descriptive reason on the aborted session — riding the abort frame that is already sent, with no protocol change.

Alternatives considered

Hardcoded constants with no override — simplest, and still covers the standalone deserialize() path, but leaves no escape hatch for applications that legitimately exchange very large payloads. Kept the hardcoded defaults as the floor, but made them overridable.
Global / static configuration — rejected: there is no precedent in the codebase (configuration is per-session), it would leak one application's setting across unrelated sessions, and it is awkward inside Workers isolates.
Protocol handshake to advertise limits to the peer — rejected: it adds a round trip (or is structurally impossible for the one-shot HTTP batch transport), undermines promise pipelining by forcing an await before the first call, and adds no security — an untrusted peer simply ignores advertised limits, so the receiver must enforce locally regardless.
App-level getLimits() RPC method — purely a cooperative-peer ergonomic that needs no library change (an application can already expose such a method); it is not a substitute for local enforcement, so it is left to applications that want it.

Deferred (potential follow-ups)

These are larger resource-exhaustion concerns from the issue discussion that need rate/quota or backpressure design rather than a per-message guard, and are intentionally out of scope here:

Unbounded growth of the imports/exports tables from a flood of push / stream / pipe messages (the "many messages → OOM" case) — needs per-session quotas and backpressure.
Attacker-chosen import indices causing sparse-array bloat.
remap instruction/capture fan-out producing multiplicative work.
Streams and blobs buffered fully in memory.

Note: nested call-arguments are deserialized as a fresh payload with a fresh depth budget, so a message that nests deeply through call arguments can still recurse beyond maxDepth. This fails safe — it is bounded by maxMessageSize and produces a clean session abort (a caught RangeError), not a process crash (verified) — and belongs to the same follow-up bucket.
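For readers unfamiliar with the pattern, the general shape of a recursion depth guard looks something like the sketch below. It is not `evaluateImpl`: `assertDepthWithin` and the `Json` type are hypothetical, the error message is invented, and the exact counting convention (here, the number of nested arrays or objects) is an assumption. It also shows why a fresh depth budget matters: whoever starts a new traversal at depth 0 gets the full allowance again.

```typescript
// Plain JSON value type, defined here so the example is self-contained.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Hypothetical guard: throws once more than `maxDepth` containers are nested.
// Because it throws before descending further, its own recursion is bounded
// by maxDepth as well.
function assertDepthWithin(value: Json, maxDepth: number, depth: number = 0): void {
  // Primitives add no nesting.
  if (value === null || typeof value !== "object") {
    return;
  }
  if (depth >= maxDepth) {
    throw new TypeError(`Message exceeds maximum nesting depth of ${maxDepth}.`);
  }
  if (Array.isArray(value)) {
    for (const child of value) {
      assertDepthWithin(child, maxDepth, depth + 1);
    }
  } else {
    for (const key of Object.keys(value)) {
      assertDepthWithin(value[key], maxDepth, depth + 1);
    }
  }
}
```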

Testing

New __tests__/limits.test.ts (14 tests): just-under / over boundaries for each limit, non-numeric bigint rejection, per-session override enforcement, the standalone deserialize() protected by defaults, and backwards-compatibility — sessions constructed with no options, and with empty options, round-trip normal bigints / nesting / calls unchanged.
Full suite green: node 156 passed, workerd 145 passed, no regressions.

changeset-bot · 2026-05-30T20:01:51Z

🦋 Changeset detected

Latest commit: ebee288

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

Name	Type
capnweb	Minor

github-actions · 2026-05-30T20:01:57Z

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

pkg-pr-new · 2026-05-30T20:02:16Z

npm i https://pkg.pr.new/cloudflare/capnweb@185

commit: ebee288

ndisidore · 2026-06-01T13:16:21Z

I have read the CLA Document and I hereby sign the CLA

ndisidore · 2026-06-01T13:50:32Z

+              throw new TypeError(
+                  `Deserialized bigint exceeds maximum length of ${maxBigIntDigits} digits.`);
+            }
+            // Reject anything BigInt() would parse expensively or unexpectedly before handing it


To be discussed:
I think this was a bug/unintentional behavior before.

protocol.md:181: "A bigint value, represented as a decimal string."

It is currently possible to handcraft ["bigint","0xff"]: the receiver unconditionally calls BigInt(value[1]), so this is parsed as 255n.
That leaves a disconnect between the protocol writeup and the actual implementation.

Hmm, I'm tempted to allow, and maybe even require, the hex format, because it'll obviously be much more efficient to parse. Parsing hex is strictly O(n) so you might even argue we don't need a limit then (though we might as well keep it anyway).

It's nice that switching to generating hex is backwards-compatible...

Done. The serializer now always emits hex, and the deserializer still accepts decimal (for backwards compatibility).

am I understanding correctly? in the case of a newer server talking to an older client:

before this PR: serialize(-255n) produces ["bigint","-255"]. Old receivers call BigInt("-255") --> works fine.

after this PR: serialize(-255n) produces ["bigint","-0xff"]. Old receivers call BigInt("-0xff") --> throws SyntaxError, aborting the session.

so technically this is a major break? (well, semver 0.x bump, in our pre-1.0 case).

kentonv · 2026-06-02T13:26:02Z

+      // Bound the up-front allocation of a single message before parsing it. This is a local,
+      // receiver-side guard against an untrusted peer sending an enormous message; a throw here
+      // propagates out of readLoop and aborts the session.
+      if (msgText.length > this.limits.maxMessageSize) {


Feels like it would benefit the underlying transport to know about message size limits so that it could avoid even reading more bytes than the limit into memory.

Could be passed into receive() but some transports would probably want to know in the constructor?

Maybe that just means it is up to the app to pass in when they are constructing such a transport.

I wonder if any WebSocket implementations support specifying limits. You'd think they would since otherwise sending enormous WebSocket frames seems like an easy way to exhaust anyone's memory...

So "plumb the limit into transports" decomposes into three quite different problems:

Plumbing: how does the limit even reach a transport? Transports are frequently built by the app and handed to new RpcSession(transport, …), so options.limits can't reach a pre-built transport. Only the newWebSocketRpcSession / newHttpBatchRpcResponse helpers (which construct the transport internally) could thread it through; the raw constructor path can't. That's an interface-design decision with back-compat implications.

Socket-based transports (the untrusted-facing ones): capnweb can't enforce the limit there at all, because the platform reads and buffers the frame before capnweb's listener runs. The only real lever is the underlying socket's native limit: Node ws's maxPayload, Bun's maxPayloadLength, or the runtime's own cap (Cloudflare Workers, for instance, already imposes a built-in WebSocket message-size limit). All of those are app-configured when the socket is created (you mention landing on this as well).

HTTP batch: this is the one place capnweb genuinely controls the read, so it's the only transport where plumbing the limit adds real value, and even there it means replacing the clean .text() with a manual ReadableStream read that counts bytes and aborts past the cap (plus disambiguating batch size vs per-message size, since a batch is newline-joined messages in one body).
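To make the "manual read that counts bytes and aborts past the cap" idea concrete, here is a minimal sketch. Nothing in it is capnweb code: `ChunkReader` and `readBytesWithLimit` are hypothetical names, and the interface is a cut-down stand-in that the reader returned by a byte stream's `getReader()` should satisfy structurally. Decoding the bytes to text and splitting the batch into messages are left out.

```typescript
// Minimal structural view of a stream reader, declared locally so the
// example does not depend on DOM or Node typings.
interface ChunkReader {
  read(): Promise<{ done: boolean; value?: Uint8Array }>;
  cancel(reason?: unknown): Promise<void>;
}

// Hypothetical helper: reads the whole body, but stops as soon as the
// running byte count exceeds maxBytes instead of buffering everything.
async function readBytesWithLimit(reader: ChunkReader, maxBytes: number): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  let total = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value === undefined) continue;
    total += value.byteLength;
    if (total > maxBytes) {
      // Tell the source to stop producing data, then fail the read.
      await reader.cancel("message too large");
      throw new RangeError(`Message body exceeds maximum size of ${maxBytes} bytes.`);
    }
    chunks.push(value);
  }
  // Concatenate the accepted chunks into one buffer.
  const body = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    body.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return body;
}
```

At most one chunk beyond the cap is held in memory before the read is abandoned, which is the property a post-hoc length check on an already-buffered string cannot provide.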

My read is that this ultimately boils down to touching the RpcTransport interface, all 5 transports, the session-helper entry points, and docs/tests, all while still being unable to protect the most exposed case (a WebSocket server) from within capnweb's own code, because that protection physically lives in the socket the app constructs.

I'm going to suggest we leave this to a follow-up PR.

Fair summary. I was thinking something along the lines of passing the limits into the transport constructors. Of course, if the app constructs its own transport, it is its own job to pass limits to that.

But I suppose as you point out the standard WebSocket API doesn't actually give us any ability to configure max incoming message size... we'd have to use non-standard settings if anything... bleh. Maybe leave it up to the app.

kentonv · 2026-06-02T13:35:15Z

+              throw new TypeError(
+                  `Deserialized bigint exceeds maximum length of ${maxBigIntDigits} digits.`);
+            }
+            // Reject anything BigInt() would parse expensively or unexpectedly before handing it


Hmm, I'm tempted to allow, and maybe even require, the hex format, because it'll obviously be much more efficient to parse. Parsing hex is strictly O(n) so you might even argue we don't need a limit then (though we might as well keep it anyway).

It's nice that switching to generating hex is backwards-compatible...

Deserializing messages from an untrusted peer could exhaust CPU, memory, or the stack:
- A long [bigint,...] digit string fed straight to BigInt() blocks the event loop, because BigInt()'s decimal parse is superlinear.
- A deeply nested message could overflow the stack during evaluation.
- An arbitrarily large message was JSON-parsed with no up-front size bound.

Add three local, receiver-side guards in the deserialization path:
- Cap bigint digit length and reject non-numeric strings before BigInt().
- Bound nesting depth in Evaluator.evaluateImpl, mirroring the existing send-side depth limit in Devaluator.
- Reject oversized messages in the session read loop before JSON.parse.

Limits have safe exported defaults (DEFAULT_LIMITS) and are overridable per session via a new RpcLimits-typed 'limits' field on RpcSessionOptions, surfaced to the Evaluator through the Importer interface so the standalone deserialize() path stays protected by the defaults. Limits are purely local (the protocol has no negotiation step); exceeding one throws, which aborts the session via the existing abort path.

kentonv · 2026-06-10T18:57:44Z

+"capnweb": minor
+---
+
+Add receiver-side resource limits to guard against untrusted-peer resource exhaustion (#184). Deserialization now caps the length of `bigint` values, accepts both emitted hex strings and legacy decimal strings, bounds message nesting depth across nested call arguments, and rejects oversized incoming messages before parsing. The limits are local, receiver-side decisions with safe defaults (`DEFAULT_LIMITS`), and can be overridden per session via the new `limits` field on `RpcSessionOptions`. Exceeding a limit aborts the session, reusing the existing abort path.


Hmm, I think the changeset description is really meant to be a one-liner, it shows up in a bullet list in the release notes. I actually don't know how multiple paragraphs will render.

dimitropoulos · 2026-06-13T19:53:44Z

+function nestedCallArgs(depth: number): string {
+  let s = "1";
+  for (let i = 0; i < depth; i++) {
+    s = `["pipeline",0,["echo"],[${s}]]`;
+  }
+  return s;
+}


Suggested change

-function nestedCallArgs(depth: number): string {
-  let s = "1";
-  for (let i = 0; i < depth; i++) {
-    s = `["pipeline",0,["echo"],[${s}]]`;
-  }
-  return s;
-}
+function nestedCallArgs(depth: number): string {
+  return `["pipeline",0,["echo"],[`.repeat(depth) + "1" + "]]".repeat(depth);
+}

same as the above - I think we can avoid the loop assignment with this

dimitropoulos · 2026-06-13T19:55:48Z

+function nestedEscapedArrays(depth: number): string {
+  let s = "1";
+  for (let i = 0; i < depth; i++) {
+    s = `[[${s}]]`;
+  }
+  return s;
+}


Suggested change

-function nestedEscapedArrays(depth: number): string {
-  let s = "1";
-  for (let i = 0; i < depth; i++) {
-    s = `[[${s}]]`;
-  }
-  return s;
-}
+function nestedEscapedArrays(depth: number): string {
+  return `${"[[".repeat(depth)}1${"]]".repeat(depth)}`;
+}

maybe I'm misunderstanding the algorithm, but I'm pretty sure it can be this? that way we can avoid needing to do the assignment in a loop. not like it's a big deal or anything

dimitropoulos · 2026-06-13T20:07:12Z

+      // buffered the complete string; true pre-read enforcement belongs in the transport/socket.
+      // This backstop still prevents oversized messages from reaching JSON.parse and downstream
+      // deserialization work. A throw here propagates out of readLoop and aborts the session.
+      if (msgText.length > this.limits.maxMessageSize) {


do we need to be thinking in terms of bytes not characters? so like

// 50 emoji = 100 UTF-16 code units (.length), but 200 bytes in UTF-8
// PASSES the check despite being 2x the "limit" in actual memory
const msg = "😀".repeat(50);
console.log({
  chars: msg.length, // 100
  bytes: new TextEncoder().encode(msg).byteLength, // 200
});

// Conversely, 100 ASCII chars = 100 code units AND 100 bytes
const msg2 = "a".repeat(100);
console.log({
  chars: msg2.length, // 100
  bytes: new TextEncoder().encode(msg2).byteLength, // 100
})
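If the limit were defined in UTF-8 bytes rather than UTF-16 code units, one way to measure an already-decoded string without allocating an encoded copy is sketched below. `utf8ByteLength` is a hypothetical helper, not something in the PR. Lone surrogates are counted as 3 bytes, matching the U+FFFD replacement that `TextEncoder` produces.

```typescript
// Hypothetical helper: number of bytes `text` would occupy when encoded as
// UTF-8, computed by walking UTF-16 code units.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit < 0x80) {
      bytes += 1; // ASCII
    } else if (unit < 0x800) {
      bytes += 2; // two-byte sequence
    } else if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < text.length) {
      const next = text.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4; // valid surrogate pair: one code point above U+FFFF
        i++;        // skip the low surrogate
      } else {
        bytes += 3; // lone high surrogate, encoded as U+FFFD
      }
    } else {
      bytes += 3; // rest of the BMP, including lone surrogates
    }
  }
  return bytes;
}
```

A check written as `utf8ByteLength(msgText) > maxMessageSize` would then treat the emoji and ASCII cases in the comment above consistently, at the cost of an O(n) scan per message.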

dimitropoulos · 2026-06-13T20:09:04Z

+    let aborted = await pushFrame(overLimit, { maxBigIntDigits: 4 }).waitForAbort();
+    expect(aborted).toMatch(/bigint exceeds maximum length/);
+
+    // The same 5-digit bigint is well under the default limit, so it does NOT abort by default.


but it's not a 5-digit, it's a 3-digit hex number with a 0x prefix (not part of the number). In general, I feel like it'd be pretty reasonable to not consider the 0x in counting digits.

and reading the section about handling negatives makes me wonder whether the negative sign would also wrongly be counted as a digit.
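A digit count that ignores both the sign and the 0x prefix, as suggested here, could look roughly like this. `countBigIntDigits` is a hypothetical name and not part of the PR. It assumes the string has already been validated as a well-formed decimal or hex literal.

```typescript
// Hypothetical helper: counts only the digits of a wire bigint string,
// excluding an optional leading "-" and an optional "0x" prefix.
function countBigIntDigits(text: string): number {
  let body = text.startsWith("-") ? text.slice(1) : text;
  if (body.startsWith("0x")) {
    body = body.slice(2);
  }
  return body.length;
}
```

With this counting, `"0xfff"` has 3 digits and `"-0xff"` has 2, so a limit of 4 would accept both.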

dimitropoulos · 2026-06-13T20:11:05Z

+              return BigInt(digits);
+            } else if (/^-?0x[0-9a-fA-F]+$/.test(digits)) {
+              let isNegative = digits.startsWith("-");
+              let magnitude = BigInt(isNegative ? digits.slice(1) : digits);


wow, I didn't know that BigInt("-0xf") throws. huh!
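Because `BigInt()` rejects a signed hex string, the sign has to be handled separately on both sides. The sketch below mirrors the regular expression and sign handling visible in the diff hunk above, but `bigintToWire` and `bigintFromWire` are hypothetical names, the decimal branch and error message are written for illustration, and the length limit is omitted for brevity.

```typescript
// Hypothetical serializer: always emits hex, with the sign outside the
// "0x" prefix (for example "-0xff").
function bigintToWire(value: bigint): string {
  const negative = value < BigInt(0);
  const magnitude = negative ? -value : value;
  return (negative ? "-" : "") + "0x" + magnitude.toString(16);
}

// Hypothetical deserializer: accepts legacy decimal strings as well as the
// signed hex form.
function bigintFromWire(text: string): bigint {
  if (/^-?[0-9]+$/.test(text)) {
    return BigInt(text); // BigInt() handles a signed decimal string directly
  }
  if (/^-?0x[0-9a-fA-F]+$/.test(text)) {
    // BigInt("-0xff") throws a SyntaxError, so parse the magnitude and
    // apply the sign afterwards.
    const negative = text.startsWith("-");
    const magnitude = BigInt(negative ? text.slice(1) : text);
    return negative ? -magnitude : magnitude;
  }
  throw new TypeError("Deserialized bigint is not a decimal or hex integer.");
}
```

A receiver that only ever calls `BigInt(text)` can read unsigned hex such as `"0xff"`, but not the signed form, which is the compatibility concern raised earlier in this thread.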

dimitropoulos

some questions

github-actions Bot added a commit that referenced this pull request Jun 1, 2026

@ndisidore has signed the CLA in #185

74799f6

ndisidore commented Jun 1, 2026

kentonv reviewed Jun 2, 2026

ndisidore force-pushed the fix/184-untrusted-resource-limits branch from a6ac783 to ebee288 on June 8, 2026 17:49

kentonv reviewed Jun 10, 2026

kentonv approved these changes Jun 10, 2026

dimitropoulos self-requested a review June 13, 2026 19:49

dimitropoulos reviewed Jun 13, 2026


3 participants
