[world-vercel] Always try JSON queue transport as a fallback#1717
[world-vercel] Always try JSON queue transport as a fallback#1717VaguelySerious wants to merge 2 commits intomainfrom
Conversation
Signed-off-by: Peter Wielander <[email protected]>
🦋 Changeset detectedLatest commit: 00bea6d The changes in this PR will be included in the next version bump. This PR includes changesets to release 17 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
📊 Benchmark Results
workflow with no steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) | Nitro | Express workflow with 1 step💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Next.js (Turbopack) | Express workflow with 10 sequential steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) workflow with 25 sequential steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) workflow with 50 sequential steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) Promise.all with 10 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Next.js (Turbopack) | Nitro Promise.all with 25 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) | Nitro | Express Promise.all with 50 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) Promise.race with 10 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Next.js (Turbopack) | Nitro Promise.race with 25 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) | Nitro | Express Promise.race with 50 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) workflow with 10 sequential data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Next.js (Turbopack) | Nitro workflow with 25 sequential data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) workflow with 50 sequential data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) workflow with 10 concurrent data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Next.js (Turbopack) | Nitro workflow with 25 concurrent data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) workflow with 50 concurrent data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) Stream Benchmarks (includes TTFB metrics)workflow with stream💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) stream pipeline with 5 transform steps (1MB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Next.js (Turbopack) | Express | Nitro 10 parallel streams (1MB each)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) fan-out fan-in 10 streams (1MB each)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) SummaryFastest Framework by WorldWinner determined by most benchmark wins
Fastest World by FrameworkWinner determined by most benchmark wins
Column Definitions
Worlds:
|
🧪 E2E Test Results❌ Some tests failed Summary
❌ Failed Tests🌍 Community Worlds (74 failed)mongodb (7 failed):
redis (7 failed):
turso (60 failed):
Details by Category✅ ▲ Vercel Production
✅ 💻 Local Development
✅ 📦 Local Production
✅ 🐘 Local Postgres
✅ 🪟 Windows
❌ 🌍 Community Worlds
✅ 📋 Other
|
…es unhandled to the caller, crashing the workflow instead of being silently handled. This commit fixes the issue reported at packages/world-vercel/src/queue.ts:260 **Bug Explanation:** In `packages/world-vercel/src/queue.ts`, the `queue` function sends messages via CBOR transport and falls back to JSON transport on failure. The error handling logic catches `DuplicateMessageError` from the primary CBOR `send()` call, but the JSON fallback `send(jsonTransport)` on line 263 is not wrapped in any error handling. The scenario where this manifests: 1. `send(cborTransport)` is called with an `idempotencyKey` in `sendOpts` 2. The server successfully receives and processes the message 3. The CBOR response parsing fails (e.g., network issue, malformed response), throwing a non-`DuplicateMessageError` 4. The catch block falls through the `DuplicateMessageError` check and enters the `if (useCbor)` fallback 5. `send(jsonTransport)` is called with the same `sendOpts` including the same `idempotencyKey` 6. The server already has this message, so it throws `DuplicateMessageError` 7. This error propagates completely unhandled to the caller, crashing the workflow This is a regression from the design intent. The code explicitly handles `DuplicateMessageError` for the primary path with a comment saying "Silently handle idempotency key conflicts - the message was already queued. This matches the behavior of world-local and world-postgres." The JSON fallback should have the same handling. **Fix Explanation:** Wrapped the `send(jsonTransport)` fallback call in its own try-catch block that handles `DuplicateMessageError` identically to the primary path — returning a placeholder `messageId` with the `msg_duplicate_` prefix. Any other errors from the fallback are re-thrown as before. This ensures consistent behavior regardless of which transport path encounters a duplicate message. Co-authored-by: Vercel <vercel[bot]@users.noreply.github.com> Co-authored-by: VaguelySerious <[email protected]>
TooTallNate
left a comment
There was a problem hiding this comment.
Review
The refactor to extract the send() helper is clean, and the secondary DuplicateMessageError handler in the fallback path is a correct fix for the bug described in commit 2. The title change ("Always try JSON queue transport as a fallback") is where I have concerns.
Concern: silent duplicate enqueue when idempotencyKey is absent
Most world.queue() callers in the runtime don't pass an idempotencyKey. A quick search shows only suspension-handler.ts:275 sets one (for step dispatch). All other sites — workflow re-enqueue from step-handler, start, runs, resume-hook — don't.
For those callers, the fallback path is unsafe when the CBOR send partially succeeded:
send(cborTransport)→ server receives message, ingests it, begins returning response- Client fails to parse response (network blip, malformed response, etc.)
- CBOR send throws (not
DuplicateMessageError) send(jsonTransport)is called with the samesendOpts, butidempotencyKeyisundefined- Server has no way to detect this is a duplicate — it enqueues a second copy
The resulting duplicate workflow re-enqueue would be deduped by the workflow's own event log (concurrent handlers racing on run_started return EntityConflictError), so the impact is just extra queue noise — but it's still strictly worse behavior than before this PR for those callers.
Suggestions
Pick one:
-
Only fall back when idempotencyKey is set (safest):
if (useCbor && sendOpts.idempotencyKey) { try { return await send(jsonTransport); } catch ... }
Non-idempotent callers continue to get the old behavior (throw). This is a strict improvement — you fix the bug without introducing the duplicate-send footgun.
-
Auto-generate an idempotencyKey when one isn't provided, used only for this fallback path. Something like a random UUID reused across the CBOR→JSON retry. This preserves the fallback for all callers.
-
Narrow the fallback to specific error types — only fall back on CBOR-specific errors (decode failures), not generic network errors. This way, "server received but client couldn't parse response" scenarios still throw as before, but actual CBOR incompatibility errors get the fallback. Hard to implement reliably without knowing what errors
cbor-xcan throw.
I'd go with option 1. The original bug from commit 2 (JSON fallback throwing unhandled DuplicateMessageError) is correctly fixed either way.
Minor nits
- The changeset message ("Fall back to JSON queue transport if CBOR parsing fails") is misleading — the fallback happens on any non-
DuplicateMessageError, not just CBOR parsing failures. fallbackError.idempotencyKey ?? opts?.idempotencyKey ?? 'unknown'— the'unknown'branch is reachable only when no idempotencyKey was set, which is exactly the concerning case above. A message with IDmsg_duplicate_unknownis a code smell.
|
Follow-up after getting context on the intent: I understand this is meant to address the rolling-deploy scenario where a new sender (CBOR) enqueues a message that might be picked up by an old handler instance (pre- If that's the goal, this diff doesn't actually achieve it. The send-side try/catch only observes errors from the
So If rolling-deploy compatibility is the actual concern, a better approach would be one of:
The existing That said, if my understanding of the intent is off, let me know. Or if the real goal is just defensive belt-and-suspenders for CBOR encoding failures, the PR does that — in which case the title/changeset message should be reworded. |
No description provided.