Closed
This PR adds support for streaming predictions via the `replicate.stream()` method. Changes:

- Add `stream()` method to both `Replicate` and `AsyncReplicate` clients
- Add module-level `stream()` function for convenience
- Create new `lib/_predictions_stream.py` module with streaming logic
- Add comprehensive tests for sync and async streaming
- Update README with documentation and examples using `anthropic/claude-4-sonnet`

The stream method creates a prediction and returns an iterator that yields output chunks as they become available via Server-Sent Events (SSE). This is useful for language models where you want to display output as it's generated.
DP-671 Add support for `replicate.stream()`
The legacy 1.x client supports a method called `stream()`. When creating a prediction via the API, the returned prediction object will always have a stream URL. Docs about streaming are here: https://replicate.com/docs/topics/predictions/streaming

Tasks:
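The issue above points at Replicate's streaming docs. To make the prediction-to-stream relationship concrete, here is a hypothetical payload shaped like the API's prediction response; the `id` and URL values are invented for illustration, and only the presence of a `urls.stream` entry is the point.

```python
# Hypothetical prediction payload shaped like a Replicate API response.
# The "stream" entry under "urls" is what a streaming client would
# connect to for Server-Sent Events; all values here are made up.
prediction = {
    "id": "abc123",
    "status": "starting",
    "urls": {
        "get": "https://api.replicate.com/v1/predictions/abc123",
        "stream": "https://stream.replicate.com/v1/predictions/abc123",
    },
}

# A client implementing stream() would create the prediction, then
# open an SSE connection to this URL and yield chunks to the caller.
stream_url = prediction["urls"]["stream"]
print(stream_url)
```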
The API uses Server-Sent Events internally, but the Python client yields plain string chunks to the user, not SSE event objects.
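The split between wire format and user-facing API can be sketched with a toy parser: SSE framing comes in, plain strings go out. This is an illustrative sketch of the idea, not the client's actual implementation; the `output`/`done` event names follow Replicate's streaming docs, and the minimal parsing here ignores multi-line `data:` fields, `id:`, and `retry:`.

```python
def iter_text_chunks(sse_lines):
    """Parse Server-Sent Events lines and yield plain string chunks.

    Minimal sketch: handles only 'event:'/'data:' pairs and stops at
    the 'done' event, mirroring how a streaming client can hide SSE
    framing from the caller entirely.
    """
    event = None
    for line in sse_lines:
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = line[len("data:"):].strip()
            if event == "done":
                return
            if event == "output":
                yield data  # the user sees only text, not the SSE envelope

raw = [
    "event: output", "data: Hello",
    "event: output", "data: world",
    "event: done", "data: {}",
]
print("".join(iter_text_chunks(raw)))  # -> Helloworld
```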
Some thoughts:
What effect does that have? The function still works, but a user also sees that error message if they try to run it? Is there a proper way to not implement it at all, but display a helpful message when users call it?
Replaced by #79
This PR adds support for streaming predictions via the `replicate.stream()` method, as specified in DP-671. This change is intended to support feature parity with the legacy pre-Stainless 1.x client.
Changes
- Add `stream()` method to both `Replicate` and `AsyncReplicate` clients
- Add module-level `stream()` function for convenience
- Create new `lib/_predictions_stream.py` module with streaming logic
- Add comprehensive tests for sync and async streaming
- Update README with documentation and examples using `anthropic/claude-4-sonnet`

The `stream()` method creates a prediction and returns an iterator that yields output chunks as strings as they become available from the streaming API. This is useful for language models where you want to display output as it's generated rather than waiting for the entire response.

Example Usage
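A sketch of the intended call pattern. Running the real thing requires an API token, so `fake_stream` below is a stand-in for `replicate.stream()` that yields canned chunks; the commented lines show what the real call would look like.

```python
def fake_stream(model, input):
    # Stand-in for replicate.stream(): yields output chunks as plain
    # strings, the same shape the real iterator would produce.
    for chunk in ["Streaming ", "lets you ", "print as ", "you go."]:
        yield chunk

# With the real client this would be:
#   for chunk in replicate.stream("anthropic/claude-4-sonnet",
#                                 input={"prompt": "Say hello"}):
for chunk in fake_stream("anthropic/claude-4-sonnet", input={"prompt": "Say hello"}):
    print(chunk, end="")  # display output as it is generated
```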
Testing locally
Clone the repo and checkout the branch:
```
gh repo clone replicate/replicate-python-stainless
cd replicate-python-stainless
gh pr checkout 75
```

Set up the development environment:
Run the tests:
Try the example:
Prompts
Related: DP-671