
Reuse OpenAI clients to improve throughput #71

@luojiyin1987

Problem

Currently, every call to ChatGPT_API, ChatGPT_API_with_finish_reason, and ChatGPT_API_async creates a new OpenAI/AsyncOpenAI client instance. This adds significant overhead (sketched after this list):

  • A new HTTP connection pool is created on each call, so connections are never reused
  • Throughput drops in high-concurrency scenarios
  • The async version uses async with, which creates and closes a client on every request
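
For reference, a minimal sketch of the per-call pattern described above, assuming the current functions look roughly like this (the signatures and bodies here are illustrative, not the repository's actual code):

```python
from openai import OpenAI, AsyncOpenAI

def ChatGPT_API(model, prompt, api_key):
    # A new client, with its own HTTP connection pool, is built on every call.
    client = OpenAI(api_key=api_key)
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

async def ChatGPT_API_async(model, prompt, api_key):
    # `async with` opens and then closes the client on each request,
    # so connections are never reused across calls.
    async with AsyncOpenAI(api_key=api_key) as client:
        response = await client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content
```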

Solution

  • Add module-level singleton clients, cached per API key (see the sketch below)
  • Replace per-call client creation with _get_sync_client/_get_async_client helper functions
  • Fix the chat_history mutation side effect (build messages with list concatenation instead of appending in place)
  • Fix the inconsistent error return type in ChatGPT_API_with_finish_reason
  • Add the missing re import
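
A minimal sketch of the proposed caching, assuming one client per API key is sufficient. The helper names _get_sync_client and _get_async_client come from the list above; the ChatGPT_API signature shown is an assumption for illustration:

```python
from openai import OpenAI, AsyncOpenAI

# Module-level caches: one client per API key, reused across calls.
_sync_clients: dict[str, OpenAI] = {}
_async_clients: dict[str, AsyncOpenAI] = {}

def _get_sync_client(api_key: str) -> OpenAI:
    # The first call for a key creates the client; later calls reuse
    # it, and with it the underlying HTTP connection pool.
    if api_key not in _sync_clients:
        _sync_clients[api_key] = OpenAI(api_key=api_key)
    return _sync_clients[api_key]

def _get_async_client(api_key: str) -> AsyncOpenAI:
    if api_key not in _async_clients:
        _async_clients[api_key] = AsyncOpenAI(api_key=api_key)
    return _async_clients[api_key]

def ChatGPT_API(model: str, prompt: str, api_key: str, chat_history=None):
    client = _get_sync_client(api_key)
    # Concatenation builds a fresh list, so the caller's chat_history
    # is never mutated as a side effect.
    messages = (chat_history or []) + [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content
```

A plain dict keeps the cache simple; code calling these helpers from multiple threads would want a lock around the insert, and a long-running process keeps each client (and its connection pool) alive for the life of the module, which is exactly the reuse the issue is after.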

Impact

  • Higher throughput for batch API calls
  • Lower per-call connection overhead
  • Better reuse of connections across requests
