[core] Gate turbo end-of-run drain writes on the run-ready barrier#2685
Open
VaguelySerious wants to merge 2 commits into
A workflow that creates a fire-and-forget hook (or wait/attribute) and then returns synchronously never suspends, so the `*_created` event is committed by the end-of-run drain inside `runWorkflow` — before the runtime's terminal `awaitRunReady()`. In turbo (`run_started` backgrounded), that write could reach the server before the run exists. Thread the run-ready barrier through `runWorkflow` into the drain's `handleSuspension` call so these writes are gated the same way the normal suspension path already is. No-op outside turbo (barrier undefined). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
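The gating described above can be illustrated with a minimal sketch. It is not the project's implementation: `gatedWrite`, `send`, `Write`, and `demoGatedDrain` are hypothetical names, and the only behaviors taken from this PR are that a write waits on an optional run-ready barrier, that an undefined barrier means no gating, and that a barrier rejection is swallowed for ordering only.

```typescript
// Hypothetical types and names; a sketch of the gating idea only.
type Write = { type: string; runId: string };

// Wait for the run-ready barrier (if one was passed) before performing a write.
// A rejected barrier is swallowed: it is used for ordering only, and a failed
// run_started would surface through the write that follows.
async function gatedWrite(
  write: Write,
  send: (w: Write) => Promise<void>,
  runReady?: Promise<void>,
): Promise<void> {
  if (runReady !== undefined) {
    try {
      await runReady;
    } catch {
      // Ordering only: fall through and attempt the write anyway.
    }
  }
  await send(write);
}

// Demo: run_started is backgrounded and lands later; hook_created is gated on it.
async function demoGatedDrain(): Promise<string[]> {
  const log: string[] = [];
  let markReady!: () => void;
  const runReady = new Promise<void>((resolve) => {
    markReady = resolve;
  });
  const send = async (w: Write): Promise<void> => {
    log.push(w.type);
  };

  // The end-of-run drain starts its write before run_started has landed.
  const drain = gatedWrite({ type: "hook_created", runId: "run_1" }, send, runReady);

  // Simulate the backgrounded run_started completing a little later.
  setTimeout(() => {
    log.push("run_started");
    markReady();
  }, 10);

  await drain;
  return log;
}
```

With no barrier passed, `gatedWrite` calls `send` immediately, which corresponds to the non-turbo no-op case.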
🦋 Changeset detected
Latest commit: 74b443b
The changes in this PR will be included in the next version bump. This PR includes changesets to release 16 packages.
📊 Benchmark Results
(Result tables not captured. Each benchmark ran in 💻 Local Development and ▲ Production (Vercel).)
Benchmarks: workflow with no steps; workflow with 1 step; workflow with 10, 25, and 50 sequential steps; Promise.all with 10, 25, and 50 concurrent steps; Promise.race with 10, 25, and 50 concurrent steps; workflow with 10, 25, and 50 sequential data payload steps (10KB); workflow with 10, 25, and 50 concurrent data payload steps (10KB).
Stream Benchmarks (includes TTFB metrics): workflow with stream; stream pipeline with 5 transform steps (1MB); 10 parallel streams (1MB each); fan-out fan-in 10 streams (1MB each).
❌ Some benchmark jobs failed. Check the workflow run for details.
🧪 E2E Test Results
✅ All tests passed
Details by Category: ✅ ▲ Vercel Production, ✅ 💻 Local Development, ✅ 📦 Local Production, ✅ 🐘 Local Postgres, ✅ 🪟 Windows, ✅ 📋 Other
The barrier threaded into the end-of-run drain gates every write the suspension handler performs, including wait_created. Add explicit coverage mirroring the hook case: a fire-and-forget `void sleep(...)` that completes synchronously must not write wait_created before the backgrounded run_started lands. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
Turbo mode backgrounds `run_started` and runs the workflow body optimistically before it is durable, gating every downstream write on a run-ready barrier so nothing reaches the server before the run exists. The awaited-hook path is already gated (the suspension handler threads the barrier).

But a workflow that creates a fire-and-forget hook (or wait/attribute) and then returns synchronously never suspends. Its `*_created` event is instead committed by the end-of-run drain inside `runWorkflow`, which runs before the runtime's terminal `awaitRunReady()`. In turbo, that write could race ahead of the still-in-flight `run_started` and hit the server before the run is created.

Fix
Thread the run-ready barrier through `runWorkflow` into `drainPendingQueueItems` → `handleSuspension`, so the drain's writes are gated exactly like the normal suspension path. No-op outside turbo (barrier `undefined`), and a barrier rejection is swallowed for ordering only (a genuinely failed `run_started` surfaces via the subsequent write).

Test
Added regression tests in `workflow.test.ts` that run a fire-and-forget-hook and a fire-and-forget-wait (`void sleep(...)`) workflow through `runWorkflow` with a pending barrier and assert (via a fixed-time race, robust to VM-setup latency) that `runWorkflow` cannot complete and the `hook_created`/`wait_created` write is withheld until the barrier resolves. Both verified to fail without the threading. `attr_set` is gated by the same path. Also covers the barrier-rejection (writes anyway) and no-barrier (non-turbo, writes immediately) cases.

🤖 Generated with Claude Code
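For readers unfamiliar with the fixed-time race used in the tests, here is a rough sketch of the pattern. It does not reproduce the tests in `workflow.test.ts`: `settledWithin` and `demoFixedTimeRace` are hypothetical names, the 50 ms window is an arbitrary choice, and the inline async function is only a stand-in for a `runWorkflow` call whose drain write is gated on the barrier.

```typescript
// Hypothetical helper: reports whether `p` settles within `ms` milliseconds.
async function settledWithin<T>(
  p: Promise<T>,
  ms: number,
): Promise<"settled" | "pending"> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<"pending">((resolve) => {
    timer = setTimeout(() => resolve("pending"), ms);
  });
  try {
    return await Promise.race([p.then(() => "settled" as const), timeout]);
  } finally {
    clearTimeout(timer);
  }
}

interface RaceResult {
  before: "settled" | "pending";
  writesBefore: number;
  after: "settled" | "pending";
  writesAfter: number;
}

async function demoFixedTimeRace(): Promise<RaceResult> {
  const writes: string[] = [];
  let release!: () => void;
  const barrier = new Promise<void>((resolve) => {
    release = resolve;
  });

  // Stand-in for runWorkflow: its end-of-run write waits on the barrier.
  const run = (async (): Promise<void> => {
    await barrier;
    writes.push("hook_created");
  })();

  // While the barrier is pending, the run must not finish and nothing is written.
  const before = await settledWithin(run, 50);
  const writesBefore = writes.length;

  // Once the barrier resolves, the run completes and the write goes through.
  release();
  const after = await settledWithin(run, 50);
  return { before, writesBefore, after, writesAfter: writes.length };
}
```

Racing against a fixed window rather than inspecting state immediately gives slow setup time to finish before the "still pending" assertion is made, which is the kind of robustness to latency the description mentions.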