[backport] [world-vercel] Use stream control frame for transparent reconnection #1766

Merged
VaguelySerious merged 5 commits into stable from
peter/stream-reconnect-stable
Apr 16, 2026

@VaguelySerious
Member

Summary

Backport of #1742 to stable.

  • Adds hold-back buffer to readFromStream that detects the 13-byte stream control frame appended by workflow-server on timeout
  • When the server signals a timeout (done=false), the client transparently reconnects from the next chunk index using ?controlFrame=1
  • Caps reconnections at 50 (~100 min of streaming at 2-min server timeout)
  • Network errors propagate to consumers via controller.error() instead of silently closing
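
For readers following the reconnection rule in the bullets above, here is a minimal TypeScript sketch of the decision a client could make once an upstream response ends. It is not the code from this PR: `StreamControlFrame`, `NextStep`, and `decideNextStep` are hypothetical names, the object shape of the parsed frame is an assumption (the PR only says the frame is 13 bytes and signals `done` plus a next chunk index), and failing with an error once the cap is reached is also an assumption.

```typescript
// Values quoted in the PR description.
const MAX_RECONNECTS = 50;
const SERVER_TIMEOUT_MINUTES = 2;
// 50 reconnects x 2 minutes each is roughly 100 minutes of streaming.
const APPROX_MAX_STREAM_MINUTES = MAX_RECONNECTS * SERVER_TIMEOUT_MINUTES;

// Hypothetical parsed form of the control frame; the real byte layout is not shown here.
interface StreamControlFrame {
  done: boolean; // true: the stream has finished; false: the server timed out
  nextIndex: number; // chunk index to resume from on reconnect
}

type NextStep =
  | { kind: "close" }
  | { kind: "reconnect"; startIndex: number }
  | { kind: "legacy-close" }
  | { kind: "error"; message: string };

// Decide what to do after the upstream response body ends.
function decideNextStep(
  frame: StreamControlFrame | null,
  reconnectsSoFar: number,
): NextStep {
  // No frame: an older server that does not append one (backward compatibility).
  if (frame === null) return { kind: "legacy-close" };
  // done=true: the stream finished normally.
  if (frame.done) return { kind: "close" };
  // done=false: server timeout. Give up once the cap is reached (assumed behavior).
  if (reconnectsSoFar >= MAX_RECONNECTS) {
    return { kind: "error", message: `Exceeded ${MAX_RECONNECTS} reconnections` };
  }
  return { kind: "reconnect", startIndex: frame.nextIndex };
}
```

Keeping the decision in a pure function like this is one way to make the three cases named in the test plan (reconnect, backward compat, error) easy to unit test separately from the network code.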

Test plan

  • Unit tests for parseStreamControlFrame (8 tests)
  • Integration tests for reconnection, backward compat, and error propagation (3 tests)

🤖 Generated with Claude Code

Add hold-back buffer to readFromStream that detects the 13-byte stream
control frame appended by workflow-server on timeout. When the server
signals a timeout (done=false), the client transparently reconnects
from the next chunk index. Caps reconnections at 50 (~100 min).

Network errors propagate to consumers via controller.error() instead
of silently closing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@VaguelySerious VaguelySerious requested a review from a team as a code owner April 16, 2026 02:54
@vercel
Contributor

vercel bot commented Apr 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| example-nextjs-workflow-turbopack | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| example-nextjs-workflow-webpack | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| example-workflow | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| workbench-astro-workflow | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| workbench-express-workflow | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| workbench-fastify-workflow | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| workbench-hono-workflow | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| workbench-nitro-workflow | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| workbench-nuxt-workflow | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| workbench-sveltekit-workflow | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| workbench-vite-workflow | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| workflow-docs | Ready | Preview, Comment, Open in v0 | Apr 16, 2026 7:15pm |
| workflow-swc-playground | Ready | Preview, Comment | Apr 16, 2026 7:15pm |
| workflow-web | Ready | Preview, Comment | Apr 16, 2026 7:15pm |

@changeset-bot

changeset-bot bot commented Apr 16, 2026

🦋 Changeset detected

Latest commit: d78a4d4

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 18 packages
Name Type
@workflow/world-vercel Patch
@workflow/cli Patch
@workflow/core Patch
@workflow/web Patch
workflow Patch
@workflow/world-testing Patch
@workflow/builders Patch
@workflow/next Patch
@workflow/nitro Patch
@workflow/vitest Patch
@workflow/web-shared Patch
@workflow/ai Patch
@workflow/astro Patch
@workflow/nest Patch
@workflow/rollup Patch
@workflow/sveltekit Patch
@workflow/vite Patch
@workflow/nuxt Patch

@github-actions
Contributor

github-actions bot commented Apr 16, 2026

🧪 E2E Test Results

Some tests failed

Summary

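| Category | Passed | Failed | Skipped | Total |
| --- | --- | --- | --- | --- |
| ❌ ▲ Vercel Production | 889 | 12 | 67 | 968 |
| ✅ 💻 Local Development | 970 | 0 | 86 | 1056 |
| ✅ 📦 Local Production | 970 | 0 | 86 | 1056 |
| ✅ 🐘 Local Postgres | 970 | 0 | 86 | 1056 |
| ✅ 🪟 Windows | 88 | 0 | 0 | 88 |
| ❌ 🌍 Community Worlds | 16 | 68 | 0 | 84 |
| ✅ 📋 Other | 246 | 0 | 18 | 264 |
| Total | 4149 | 80 | 343 | 4572 |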

❌ Failed Tests

▲ Vercel Production (12 failed)

astro (1 failed):

  • importMetaUrlWorkflow - import.meta.url is available in step bundles | wrun_01KPBW3QV1YR1HYCBEQACJZXMC | 🔍 observability

fastify (1 failed):

  • instanceMethodStepWorkflow - instance methods with "use step" directive | wrun_01KPBW0RPQJWQD8XX6WT388358 | 🔍 observability

hono (1 failed):

  • outputStreamInsideStepWorkflow - getWritable() called inside step functions | wrun_01KPBVMG655K9S7SR3J0B4SPG3 | 🔍 observability

nextjs-turbopack (3 failed):

  • DurableAgent e2e core multiple sequential tool calls
  • addTenWorkflow | wrun_01KPBVDY4X04RNEQM3N84AWM63 | 🔍 observability
  • outputStreamWorkflow no startIndex (reads all chunks)

nextjs-webpack (3 failed):

  • wellKnownAgentWorkflow (.well-known/agent) | wrun_01KPBVEK284DTN2TSAQDEXZQTV | 🔍 observability
  • outputStreamWorkflow no startIndex (reads all chunks)
  • crossContextSerdeWorkflow - classes defined in step code are deserializable in workflow context | wrun_01KPBW18RXBXVAWVFY9EVH1DAJ | 🔍 observability

sveltekit (1 failed):

  • DurableAgent e2e core basic text response

vite (2 failed):

  • addTenWorkflow | wrun_01KPBVDY4X04RNEQM3N84AWM63 | 🔍 observability
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars) | wrun_01KPBVW9JQR18S4AYFFFHVBF92 | 🔍 observability

🌍 Community Worlds (68 failed)

mongodb-dev (1 failed):

  • dev e2e should rebuild on imported step dependency change

redis-dev (1 failed):

  • dev e2e should rebuild on imported step dependency change

turso-dev (1 failed):

  • dev e2e should rebuild on imported step dependency change

turso (65 failed):

  • addTenWorkflow | wrun_01KPBVDY4X04RNEQM3N84AWM63
  • addTenWorkflow | wrun_01KPBVDY4X04RNEQM3N84AWM63
  • wellKnownAgentWorkflow (.well-known/agent) | wrun_01KPBVEK284DTN2TSAQDEXZQTV
  • should work with react rendering in step
  • promiseAllWorkflow | wrun_01KPBVFS7XZNCF7EPS70TNTSC6
  • promiseRaceWorkflow | wrun_01KPBVFXSMG6DWV8H4E8373YZ9
  • promiseAnyWorkflow | wrun_01KPBVG018YH5WD53N66TW9EHY
  • importedStepOnlyWorkflow | wrun_01KPBVGRNKABXQH08KT40RFHNW
  • readableStreamWorkflow | wrun_01KPBVG260RNCWKZCWNHG155PA
  • hookWorkflow | wrun_01KPBVGCSQ4BWAXC6DAM138HA5
  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KPBVGR3K1FJX1P7Z18SKGVTG
  • webhookWorkflow | wrun_01KPBVH1RX158NTANJ3G67CAVY
  • sleepingWorkflow | wrun_01KPBVH7R2QTVWZSMHWRBAXDZP
  • parallelSleepWorkflow | wrun_01KPBVHM8KRQCHESSJ00121DVT
  • nullByteWorkflow | wrun_01KPBVHQJJ3NRATJJR0BSM9NBX
  • workflowAndStepMetadataWorkflow | wrun_01KPBVHT3SZHPAQHJHMPTFF0CV
  • outputStreamWorkflow no startIndex (reads all chunks)
  • outputStreamWorkflow positive startIndex (skips first chunk)
  • outputStreamWorkflow negative startIndex (reads from end)
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns correct index after stream completes
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns -1 before any chunks are written
  • outputStreamWorkflow - getTailIndex and getStreamChunks getStreamChunks returns same content as reading the stream
  • outputStreamInsideStepWorkflow - getWritable() called inside step functions | wrun_01KPBVMG655K9S7SR3J0B4SPG3
  • fetchWorkflow | wrun_01KPBVNGFM3Y4VR9963M7M5VDX
  • promiseRaceStressTestWorkflow | wrun_01KPBVP6Z5SKG0AKMYA1YBGH4P
  • error handling error propagation workflow errors nested function calls preserve message and stack trace
  • error handling error propagation workflow errors cross-file imports preserve message and stack trace
  • error handling error propagation step errors basic step error preserves message and stack trace
  • error handling error propagation step errors cross-file step error preserves message and function names in stack
  • error handling retry behavior regular Error retries until success
  • error handling retry behavior FatalError fails immediately without retries
  • error handling retry behavior RetryableError respects custom retryAfter delay
  • error handling retry behavior maxRetries=0 disables retries
  • error handling catchability FatalError can be caught and detected with FatalError.is()
  • error handling not registered WorkflowNotRegisteredError fails the run when workflow does not exist
  • error handling not registered StepNotRegisteredError fails the step but workflow can catch it
  • error handling not registered StepNotRegisteredError fails the run when not caught in workflow
  • hookCleanupTestWorkflow - hook token reuse after workflow completion | wrun_01KPBVTARB8H5MYMWDGMQKV130
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KPBVTZ8QY29TGD3X00R2SZY5
  • hookDisposeTestWorkflow - hook token reuse after explicit disposal while workflow still running | wrun_01KPBVVMYEQBMJM0Z4ACZWK16Y
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars) | wrun_01KPBVW9JQR18S4AYFFFHVBF92
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument | wrun_01KPBVY4NRHE6JN8FC3SF207QD
  • closureVariableWorkflow - nested step functions with closure variables | wrun_01KPBVYAF3BX06ENR9QV6FD2G7
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step | wrun_01KPBVYCJR9VSMY5AW9T7Y8EV9
  • health check (queue-based) - workflow and step endpoints respond to health check messages
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly | wrun_01KPBVYW5RFX6WX736FV6YCNVS
  • Calculator.calculate - static workflow method using static step methods from another class | wrun_01KPBVZ1WFVBKETMF6NB8R659D
  • AllInOneService.processNumber - static workflow method using sibling static step methods | wrun_01KPBVZ8WZWNMNE0WG23YT0ENV
  • ChainableService.processWithThis - static step methods using this to reference the class | wrun_01KPBW01PW0S7KND4T74KCFE18
  • thisSerializationWorkflow - step function invoked with .call() and .apply() | wrun_01KPBW08RZVQH6136T11JWJ96E
  • customSerializationWorkflow - custom class serialization with WORKFLOW_SERIALIZE/WORKFLOW_DESERIALIZE | wrun_01KPBW0HQSGQG9Y80A6V8XGA85
  • instanceMethodStepWorkflow - instance methods with "use step" directive | wrun_01KPBW0RPQJWQD8XX6WT388358
  • crossContextSerdeWorkflow - classes defined in step code are deserializable in workflow context | wrun_01KPBW18RXBXVAWVFY9EVH1DAJ
  • stepFunctionAsStartArgWorkflow - step function reference passed as start() argument | wrun_01KPBW1J3ZH787GXWV8PS9DRWX
  • cancelRun - cancelling a running workflow | wrun_01KPBW1V6Y7NZY48XHGJ0MC340
  • cancelRun via CLI - cancelling a running workflow | wrun_01KPBW2615R61B09JPQVVR93AV
  • pages router addTenWorkflow via pages router
  • pages router promiseAllWorkflow via pages router
  • pages router sleepingWorkflow via pages router
  • hookWithSleepWorkflow - hook payloads delivered correctly with concurrent sleep | wrun_01KPBW2J64D1PBDWZ3D42SVW7N
  • sleepInLoopWorkflow - sleep inside loop with steps actually delays each iteration | wrun_01KPBW36T2QYNVQ8MSSCTPYMVN
  • sleepWithSequentialStepsWorkflow - sequential steps work with concurrent sleep (control) | wrun_01KPBW3H2AC0V39WSHAJEFYNRF
  • importMetaUrlWorkflow - import.meta.url is available in step bundles | wrun_01KPBW3QV1YR1HYCBEQACJZXMC
  • metadataFromHelperWorkflow - getWorkflowMetadata/getStepMetadata work from module-level helper (#1577) | wrun_01KPBW3WEP3T5WPRMFKXVZZJRV
  • resilient start: addTenWorkflow completes when run_created returns 500 | wrun_01KPBW3YM7947XTQ24QE9J9BHE

Details by Category

❌ ▲ Vercel Production
App Passed Failed Skipped
❌ astro 80 1 7
✅ example 81 0 7
✅ express 81 0 7
❌ fastify 80 1 7
❌ hono 80 1 7
❌ nextjs-turbopack 83 3 2
❌ nextjs-webpack 83 3 2
✅ nitro 81 0 7
✅ nuxt 81 0 7
❌ sveltekit 80 1 7
❌ vite 79 2 7

✅ 💻 Local Development
App Passed Failed Skipped
✅ astro-stable 82 0 6
✅ express-stable 82 0 6
✅ fastify-stable 82 0 6
✅ hono-stable 82 0 6
✅ nextjs-turbopack-canary 69 0 19
✅ nextjs-turbopack-stable 88 0 0
✅ nextjs-webpack-canary 69 0 19
✅ nextjs-webpack-stable 88 0 0
✅ nitro-stable 82 0 6
✅ nuxt-stable 82 0 6
✅ sveltekit-stable 82 0 6
✅ vite-stable 82 0 6

✅ 📦 Local Production
App Passed Failed Skipped
✅ astro-stable 82 0 6
✅ express-stable 82 0 6
✅ fastify-stable 82 0 6
✅ hono-stable 82 0 6
✅ nextjs-turbopack-canary 69 0 19
✅ nextjs-turbopack-stable 88 0 0
✅ nextjs-webpack-canary 69 0 19
✅ nextjs-webpack-stable 88 0 0
✅ nitro-stable 82 0 6
✅ nuxt-stable 82 0 6
✅ sveltekit-stable 82 0 6
✅ vite-stable 82 0 6

✅ 🐘 Local Postgres
App Passed Failed Skipped
✅ astro-stable 82 0 6
✅ express-stable 82 0 6
✅ fastify-stable 82 0 6
✅ hono-stable 82 0 6
✅ nextjs-turbopack-canary 69 0 19
✅ nextjs-turbopack-stable 88 0 0
✅ nextjs-webpack-canary 69 0 19
✅ nextjs-webpack-stable 88 0 0
✅ nitro-stable 82 0 6
✅ nuxt-stable 82 0 6
✅ sveltekit-stable 82 0 6
✅ vite-stable 82 0 6

✅ 🪟 Windows
App Passed Failed Skipped
✅ nextjs-turbopack 88 0 0

❌ 🌍 Community Worlds
App Passed Failed Skipped
❌ mongodb-dev 4 1 0
❌ redis-dev 4 1 0
❌ turso-dev 4 1 0
❌ turso 4 65 0

✅ 📋 Other
App Passed Failed Skipped
✅ e2e-local-dev-nest-stable 82 0 6
✅ e2e-local-postgres-nest-stable 82 0 6
✅ e2e-local-prod-nest-stable 82 0 6

📋 View full workflow run


Some E2E test jobs failed:

  • Vercel Prod: failure
  • Local Dev: success
  • Local Prod: success
  • Local Postgres: success
  • Windows: success

Check the workflow run for details.

@VaguelySerious VaguelySerious changed the title from "[world-vercel] Use stream control frame for transparent reconnection" to "[backport] [world-vercel] Use stream control frame for transparent reconnection" Apr 16, 2026
Member

@TooTallNate TooTallNate left a comment

The code itself is a clean backport of the reconnection logic from #1742, but there's a fundamental protocol mismatch that makes it non-functional.

Re: why not use the backport Action: The repo has a Backport to stable workflow (.github/workflows/backport.yml) that triggers on merged PRs with the backport-stable label. PR #1742 has no labels and hasn't been merged yet, so the action was never triggered. Once #1742 is merged with the backport-stable label, the action would cherry-pick it automatically (or use OpenCode to resolve conflicts if the cherry-pick fails). That said, cherry-picking #1742 would have the same protocol mismatch described below, so a manual backport with adaptation is the right call — it just needs a different approach to signal control frame support.

Changes reviewed:

  1. parseStreamControlFrame, concatUint8Arrays, STREAM_CONTROL_FRAME_SIZE — identical to main, correct.
  2. Hold-back buffer + reconnection loop in readFromStream — identical logic to main's streams.get, with MAX_RECONNECTS=50 and controller.error(err) on network errors. All good.
  3. Unit tests for parseStreamControlFrame — identical to main, 8 tests. Good.
  4. Integration tests for reconnection — 3 tests (reconnect, backward compat, error propagation). Good coverage.
  5. Changeset — correct.

  });
  if (!response.ok) {
    throw new Error(`Failed to fetch stream: ${response.status}`);
  }
Member

Blocking: ?controlFrame=1 is not supported by workflow-server. The server decides whether to emit the control frame based solely on the API version from the URL path:

// workflow-server/lib/handlers/stream.ts:420
const enableControlFrame = getApiVersion(req) >= 3;

getApiVersion() is set by the v3 Hono sub-app middleware when the request path starts with /api/v3/. A query parameter has no effect.

On main, PR #1742 solved this by switching getStreamUrl to use /v3/runs/:runId/stream/:name. But on stable, readFromStream uses the deprecated route /v2/stream/:name (no runId), and the v3 API doesn't register that deprecated route — only /v3/runs/:runId/stream/:streamId exists.

So the control frame will never be emitted for this code path, and the reconnection logic is dead code.
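
To make the mismatch concrete, the sketch below imitates a path-based version check. It is an illustration only: `apiVersionFromPath` and `wouldEmitControlFrame` are hypothetical stand-ins (the real `getApiVersion()` takes the request and is set by the Hono middleware), the `/api/v<N>/` pattern is inferred from the `/api/v3/` prefix mentioned above, and returning 0 for unversioned paths is an assumption. It models only the version check, not whether a given route is registered on the v3 sub-app.

```typescript
// Hypothetical helper: derive the API version from the URL path alone.
function apiVersionFromPath(pathAndQuery: string): number {
  // Drop the query string; in this model it never influences the version.
  const queryStart = pathAndQuery.indexOf("?");
  const pathname = queryStart === -1 ? pathAndQuery : pathAndQuery.slice(0, queryStart);
  const match = /^\/api\/v(\d+)\//.exec(pathname);
  return match ? Number(match[1]) : 0; // 0 for unversioned paths (assumption)
}

// Mirrors the quoted server check: getApiVersion(req) >= 3.
function wouldEmitControlFrame(pathAndQuery: string): boolean {
  return apiVersionFromPath(pathAndQuery) >= 3;
}

// The query flag on a v2 path changes nothing.
const v2WithFlag = wouldEmitControlFrame("/api/v2/stream/my-stream?controlFrame=1"); // false
// A v3 path passes the version check (example run ID and stream name are made up).
const v3Route = wouldEmitControlFrame("/api/v3/runs/wrun_example/stream/my-stream"); // true
```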

Options to fix:

  1. If readFromStream on stable has access to a runId somewhere (even via a different caller), thread it through and use /v3/runs/:runId/stream/:name
  2. Add server-side support for ?controlFrame=1 on the v2 deprecated stream route (a small change in getStreamHandler)
  3. Register the deprecated /stream/:streamId route on the v3 sub-app in app.ts (one line)

Member Author

Fixed in 8781b5f — switched to /v3/stream/:name (option 3). Companion server PR: vercel/workflow-server#387 registers the deprecated stream route on the v3 sub-app.

@TooTallNate
Member

I suppose you did a "manual" backport because the World API is different on stable (readFromStream, etc.). Still though, I'd be curious to see how the AI does at trying to backport #1742 from the GH Action 😄

The server enables control frames based on API version (v3+), not a
query param. Use the v3 deprecated stream route which will be
registered on the server via vercel/workflow-server#387.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread packages/world-vercel/src/streamer.ts Outdated
VaguelySerious and others added 2 commits April 16, 2026 12:11
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
Signed-off-by: Peter Wielander <mittgfu@gmail.com>
@VaguelySerious VaguelySerious merged commit 3737caa into stable Apr 16, 2026
77 of 86 checks passed
@VaguelySerious VaguelySerious deleted the peter/stream-reconnect-stable branch April 16, 2026 19:33
VaguelySerious added a commit that referenced this pull request Apr 17, 2026
VaguelySerious added a commit that referenced this pull request Apr 17, 2026
… with streaming hold-back buffer

Re-applies #1742's reconnect feature on stable, without the arrayBuffer()
rewrite that shipped in 4.2.3 (#1766) and defeated incremental streaming.

- readFromStream returns the ReadableStream promptly; a hold-back buffer
  inside the pull loop holds only the last 13 bytes, forwarding surplus
  bytes as they arrive. This preserves incremental delivery for AI UIs.
- On upstream close, the tail is parsed for a stream control frame
  (done=true → close; done=false → reconnect from nextIndex; no frame →
  forward tail as data for older servers).
- Network errors propagate via controller.error() rather than silent close.
- Caps reconnections at 50 (~100 min of streaming at 2-min server timeout).

Adds a regression test that asserts data reaches the consumer before
upstream close — any buffer-then-replay implementation fails it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Peter Wielander <mittgfu@gmail.com>
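
The streaming hold-back buffer described in the commit message above can be sketched roughly as follows. This is an illustration under assumptions, not the code that landed: `HoldBackBuffer`, `concatBytes`, `push`, and `tail` are hypothetical names, and only the 13-byte size comes from the PR. The idea is that each incoming chunk releases everything except the most recent 13 bytes, so data keeps flowing to the consumer while a possible trailing control frame stays withheld.

```typescript
// Size of the stream control frame, as stated in the PR.
const STREAM_CONTROL_FRAME_SIZE = 13;

// Hypothetical helper: join two byte arrays into a new one.
function concatBytes(a: Uint8Array, b: Uint8Array): Uint8Array {
  const out = new Uint8Array(a.length + b.length);
  out.set(a, 0);
  out.set(b, a.length);
  return out;
}

// Hypothetical hold-back buffer: withholds only the last 13 bytes seen so far.
class HoldBackBuffer {
  private held: Uint8Array = new Uint8Array(0);

  // Accept a chunk and return the bytes that are now safe to forward.
  push(chunk: Uint8Array): Uint8Array {
    const combined = concatBytes(this.held, chunk);
    const surplus = Math.max(0, combined.length - STREAM_CONTROL_FRAME_SIZE);
    this.held = combined.slice(surplus); // keep at most the last 13 bytes
    return combined.slice(0, surplus); // forward everything before them
  }

  // Bytes still withheld; on upstream close these are the candidate control frame.
  tail(): Uint8Array {
    return this.held;
  }
}
```

On upstream close, the result of `tail()` is what would be handed to a control-frame parser; if no frame is found there, the same bytes would be forwarded as ordinary data, matching the behavior described for older servers.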