Add stop/cancel mechanism for SSE streaming generation #4823

@ferponse

Problem

When using runner.run_async() with StreamingMode.SSE, there is no way for consumers to signal the flow to stop generating mid-stream. This is needed for implementing a "stop generating" button in chat UIs.

Current workarounds

  • Task cancellation (asyncio.Task.cancel()): Works, but it is a hard stop. Cleanup may not run properly, and the flow cannot return gracefully.
  • Breaking out of async for: Triggers aclose() on the generator chain, but the LLM may continue generating in the background until the connection is closed.

Neither approach gives the flow a chance to stop cleanly between chunks.
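
To illustrate why task cancellation is a hard stop, here is a minimal sketch that uses a plain async generator in place of the real flow. The names chunks, consume, and demo are hypothetical and are not ADK APIs. Cancellation is delivered as CancelledError at whichever await is pending, so the generator is interrupted in the middle of producing a chunk rather than between chunks, and it can only react through try/finally.

```python
import asyncio


async def chunks(log):
    """Stand-in for a streaming flow (hypothetical, not an ADK API)."""
    try:
        for i in range(100):
            await asyncio.sleep(0.01)  # simulated wait for the next LLM chunk
            yield i
    finally:
        # Runs when CancelledError is raised at the pending await above.
        log.append("generator cleanup")


async def consume(log):
    async for i in chunks(log):
        log.append(i)


async def demo():
    log = []
    task = asyncio.create_task(consume(log))
    await asyncio.sleep(0.035)  # let a few chunks through
    task.cancel()               # hard stop: interrupts the pending await
    try:
        await task
    except asyncio.CancelledError:
        log.append("cancelled")
    return log
```

Running asyncio.run(demo()) returns a log that ends with "generator cleanup" followed by "cancelled". How many chunks precede those entries depends on timing.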

Proposed solution

Add an optional stop_event: asyncio.Event parameter to runner.run_async() that:

  • Is checked before each LLM call in the while True loop of run_async
  • Is checked before yielding each streaming chunk in _run_one_step_async
  • When set (stop_event.set()), causes the flow to stop yielding and return cleanly
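
The checks listed above could look roughly like the following sketch. It is not the ADK implementation: fake_llm_stream and run_with_stop are hypothetical stand-ins for the LLM call and for the loop inside run_async, and only the two check points (before the call, before each yield) are taken from the proposal.

```python
import asyncio
from typing import AsyncIterator, Optional


async def fake_llm_stream(n_chunks: int) -> AsyncIterator[str]:
    """Stand-in for a streaming LLM call (hypothetical, not an ADK API)."""
    for i in range(n_chunks):
        await asyncio.sleep(0)  # yield control, as a network read would
        yield f"chunk-{i}"


async def run_with_stop(
    n_chunks: int, stop_event: Optional[asyncio.Event] = None
) -> AsyncIterator[str]:
    """Yield chunks until the stream ends or stop_event is set."""
    # Check before starting the (simulated) LLM call.
    if stop_event is not None and stop_event.is_set():
        return
    stream = fake_llm_stream(n_chunks)
    try:
        async for chunk in stream:
            # Check before yielding each streaming chunk.
            if stop_event is not None and stop_event.is_set():
                return
            yield chunk
    finally:
        # Close the underlying stream explicitly on every exit path.
        await stream.aclose()


async def demo() -> list:
    stop_event = asyncio.Event()
    received = []
    async for chunk in run_with_stop(10, stop_event):
        received.append(chunk)
        if len(received) == 3:
            stop_event.set()  # simulate the user clicking "stop"
    return received
```

In this sketch the event is only polled at chunk boundaries, so a chunk that is already in flight is dropped rather than interrupted. Whether the real flow should also abort the in-flight request is left open here.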

Usage

import asyncio

stop_event = asyncio.Event()

# Forward events from the run to the client.
# Note: stop_event is the parameter proposed in this issue, not an existing one.
async def stream():
    async for event in runner.run_async(
        user_id=user_id,
        session_id=session_id,
        new_message=content,
        run_config=run_config,
        stop_event=stop_event,
    ):
        yield event

# When user clicks "stop generating":
stop_event.set()

Use case

We build a chat UI that streams responses via WebSocket. While the agent is generating, we show a "stop" button. When clicked, the frontend sends a stop message, and the backend sets stop_event so that the flow stops yielding further tokens.
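
As a rough sketch of that backend wiring, the example below replaces the WebSocket with two asyncio.Queue objects so it stays self-contained. Everything in it is an assumption for illustration: generate stands in for a run that honours stop_event, and handle_connection, the "stop" message, and the "[done]" marker are hypothetical.

```python
import asyncio
from typing import AsyncIterator


async def generate(stop_event: asyncio.Event) -> AsyncIterator[str]:
    """Hypothetical agent stream that honours stop_event between chunks."""
    for i in range(1000):
        if stop_event.is_set():
            return
        await asyncio.sleep(0)  # simulated wait for the next token
        yield f"token-{i}"


async def handle_connection(inbound: asyncio.Queue, outbound: asyncio.Queue) -> None:
    """Forward tokens to the client until done or a "stop" message arrives."""
    stop_event = asyncio.Event()

    async def listen_for_stop() -> None:
        # Read client messages concurrently with the token stream.
        while True:
            if await inbound.get() == "stop":
                stop_event.set()
                return

    listener = asyncio.create_task(listen_for_stop())
    try:
        async for token in generate(stop_event):
            await outbound.put(token)
    finally:
        listener.cancel()  # no longer needed once the stream has ended
    await outbound.put("[done]")


async def demo() -> list:
    inbound, outbound = asyncio.Queue(), asyncio.Queue()
    handler = asyncio.create_task(handle_connection(inbound, outbound))
    # Simulated client: read five tokens, then press "stop".
    first = [await outbound.get() for _ in range(5)]
    await inbound.put("stop")
    await handler
    rest = []
    while not outbound.empty():
        rest.append(outbound.get_nowait())
    return first + rest
```

The listener runs as a separate task because the handler is busy iterating the stream. Without it, the stop message would not be read until generation had already finished.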

Labels: live ([Component] This issue is related to live, voice and video chat)
