Problem
Currently, every call to `ChatGPT_API`, `ChatGPT_API_with_finish_reason`, and `ChatGPT_API_async` creates a new `OpenAI`/`AsyncOpenAI` client instance. This has significant overhead:
- A new HTTP connection pool is created on each call
- Throughput is reduced in high-concurrency scenarios
- The async version uses `async with`, which creates and closes a client on every request (roughly as sketched below)
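
For context, a minimal sketch of the pattern being described; the function signatures and parameter names are assumed for illustration and may differ from the repository's actual code:

```python
from openai import OpenAI, AsyncOpenAI


def ChatGPT_API(model, prompt, api_key):
    # A new client (and a new HTTP connection pool) is built on every call
    client = OpenAI(api_key=api_key)
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


async def ChatGPT_API_async(model, prompt, api_key):
    # `async with` opens and closes a client per request, so connections
    # are never reused across calls
    async with AsyncOpenAI(api_key=api_key) as client:
        response = await client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content
```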
Solution
- Add module-level singleton clients cached per API key (see the sketch after this list)
- Replace client creation with `_get_sync_client`/`_get_async_client` helper functions
- Fix the `chat_history` mutation side effect (use list concatenation instead of `append`)
- Fix the error return type inconsistency in `ChatGPT_API_with_finish_reason`
- Add the missing `re` import
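
A minimal sketch of the proposed caching and the `chat_history` fix, assuming the helper names above and an illustrative `ChatGPT_API` signature (the `re` import fix is not shown):

```python
from openai import OpenAI, AsyncOpenAI

# Module-level caches: one client per API key, reused across calls
_sync_clients: dict[str, OpenAI] = {}
_async_clients: dict[str, AsyncOpenAI] = {}


def _get_sync_client(api_key: str) -> OpenAI:
    client = _sync_clients.get(api_key)
    if client is None:
        client = OpenAI(api_key=api_key)
        _sync_clients[api_key] = client
    return client


def _get_async_client(api_key: str) -> AsyncOpenAI:
    client = _async_clients.get(api_key)
    if client is None:
        client = AsyncOpenAI(api_key=api_key)
        _async_clients[api_key] = client
    return client


def ChatGPT_API(model, prompt, api_key, chat_history=None):
    client = _get_sync_client(api_key)
    # List concatenation builds a new list instead of mutating the
    # caller's chat_history in place
    messages = (chat_history or []) + [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content


async def ChatGPT_API_async(model, prompt, api_key):
    # Reuses the cached client; no per-request open/close
    client = _get_async_client(api_key)
    response = await client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```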
Impact
- Improved throughput for batch API calls
- Reduced connection overhead
- Better resource utilization