perf: Replace dict-based request ID management with list-indexed structure #808

@mykaul

Summary

The per-connection request ID management uses deque(range(N)) for the free ID pool and dict{int → (cb, decoder, result_metadata)} for in-flight request tracking. Replacing both with list-based structures saves about 25 ns per full request cycle (35%) and about 15.7 KB (76%) of memory for a 300-ID connection in the benchmark below.

Current Architecture (connection.py)

Structure            | Type              | Purpose
---------------------|-------------------|--------------------------------------------
request_ids          | deque(range(300)) | Pool of available stream IDs
_requests            | dict{int → tuple} | Maps in-flight stream IDs to callbacks
orphaned_request_ids | set()             | Timed-out stream IDs awaiting late responses

Proposed Change

  • request_ids: deque → list (used as stack with pop()/append())
  • _requests: dict → list[tuple|None] (indexed by stream ID, None = free)
  • orphaned_request_ids: keep as set() (rarely used)
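
The proposed layout can be illustrated with a minimal sketch. The class name ListRequestTable and its store/take/release helpers are hypothetical and do not exist in the driver; only the attribute names request_ids and _requests come from the table above. Locking and list growth are left out here.

```python
from typing import Any, List, Optional, Tuple

# (cb, decoder, result_metadata), as described in the summary
Entry = Tuple[Any, Any, Any]


class ListRequestTable:
    """Hypothetical stand-in for the per-connection ID pool and request map."""

    def __init__(self, size: int = 300) -> None:
        # Free IDs kept as a stack; stored in reverse so pop() hands out 0 first.
        self.request_ids: List[int] = list(range(size - 1, -1, -1))
        # Slot i holds the in-flight entry for stream ID i, or None if it is free.
        self._requests: List[Optional[Entry]] = [None] * size

    def get_request_id(self) -> int:
        # Raises IndexError when the pool is empty (growth is not handled here).
        return self.request_ids.pop()

    def store(self, stream_id: int, entry: Entry) -> None:
        self._requests[stream_id] = entry

    def take(self, stream_id: int) -> Optional[Entry]:
        # Return the entry and mark the slot free; None means nothing was in flight.
        entry = self._requests[stream_id]
        self._requests[stream_id] = None
        return entry

    def release(self, stream_id: int) -> None:
        self.request_ids.append(stream_id)


table = ListRequestTable(size=4)
sid = table.get_request_id()
table.store(sid, ("callback", "decoder", None))
entry = table.take(sid)
table.release(sid)
```

One behavioral difference worth keeping in mind: a stack hands the most recently released ID out again first.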

Benchmark Results (CPython 3.14)

Full request cycle (get ID + store request + retrieve request + return ID):

Approach               | ns/op           | Memory (300 IDs)
-----------------------|-----------------|-----------------
Current (deque + dict) | 72.3 ns         | ~20.6 KB
Proposed (list + list) | 47.1 ns         | ~4.9 KB
Saving                 | 25.2 ns (35%)   | ~15.7 KB (76%)

Per-operation breakdown (dict vs list for _requests):

Operation        | dict    | list    | Saving
-----------------|---------|---------|--------
Store request    | 43.7 ns | 13.1 ns | 30.1 ns
Retrieve request | 43.2 ns | 13.1 ns | 30.1 ns
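
The harness behind these figures is not shown in the issue. A rough way to reproduce the comparison might look like the sketch below, which uses only the standard-library timeit module. The function names are hypothetical, and the use of popleft() on the current deque is an assumption. The lambda wrapper adds call overhead to both variants, so absolute numbers will differ from the tables above; only the relative gap is meaningful.

```python
import timeit
from collections import deque

N = 300
ENTRY = (None, None, None)  # placeholder for (cb, decoder, result_metadata)


def cycle_deque_dict(ids, requests):
    # Current layout: deque pool + dict of in-flight requests.
    sid = ids.popleft()      # assumption: IDs are taken from the left
    requests[sid] = ENTRY    # store request
    requests.pop(sid)        # retrieve request
    ids.append(sid)          # return ID


def cycle_list_list(ids, requests):
    # Proposed layout: list used as a stack + list indexed by stream ID.
    sid = ids.pop()
    requests[sid] = ENTRY    # store request
    entry = requests[sid]    # retrieve request
    requests[sid] = None     # mark slot free
    ids.append(sid)          # return ID
    return entry


def ns_per_op(func, ids, requests, number=100_000):
    # Best of five runs, converted to nanoseconds per full cycle.
    best = min(timeit.repeat(lambda: func(ids, requests), number=number, repeat=5))
    return best / number * 1e9


if __name__ == "__main__":
    current = ns_per_op(cycle_deque_dict, deque(range(N)), {})
    proposed = ns_per_op(cycle_list_list, list(range(N)), [None] * N)
    print(f"deque + dict: {current:.1f} ns/op")
    print(f"list + list:  {proposed:.1f} ns/op")
```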

Key Implementation Concerns

  1. error_all_requests() (connection.py:1143): Currently does requests = self._requests; self._requests = {} (atomic swap). With a list, this becomes swap + allocate new [None]*size, or iterate+clear.
  2. _requests.pop(stream_id) with KeyError (connection.py:1407, cluster.py:4509): Needs conversion to if lst[stream_id] is None check.
  3. requests.popitem()[1] (connection.py:1162): Needs to find a non-None entry — requires iteration or count tracking.
  4. not self._requests truthiness check (asyncorereactor.py:458): A list of None slots is always truthy, so this needs a separate _requests_count tracker (O(1)) or an any() call (O(n) scan).
  5. Dynamic sizing: Start at 300, grow list when highest_request_id exceeds current size (matching existing deque growth pattern).
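
The concerns above can be sketched together in one small class. This is an illustration under stated assumptions, not the driver's code: the class ListBackedRequests and its methods store, pop, has_pending and error_all are hypothetical names, only _requests and _requests_count are taken from the list above, and the locking that real connection code would need is omitted.

```python
from typing import Any, Callable, List, Optional, Tuple

# Assumed entry layout: (cb, decoder, result_metadata)
Entry = Tuple[Callable[[Any], None], Any, Any]


class ListBackedRequests:
    """Hypothetical list-indexed replacement for the _requests dict."""

    def __init__(self, size: int = 300) -> None:
        self._requests: List[Optional[Entry]] = [None] * size
        self._requests_count = 0  # concern 4: O(1) emptiness check

    def store(self, stream_id: int, entry: Entry) -> None:
        # Concern 5: grow the list when an ID beyond the current size shows up.
        missing = stream_id + 1 - len(self._requests)
        if missing > 0:
            self._requests.extend([None] * missing)
        if self._requests[stream_id] is None:
            self._requests_count += 1
        self._requests[stream_id] = entry

    def pop(self, stream_id: int) -> Optional[Entry]:
        # Concern 2: replaces dict.pop() + KeyError; None means "not in flight".
        # Negative IDs are rejected explicitly, since list indexing would wrap.
        if stream_id < 0 or stream_id >= len(self._requests):
            return None
        entry = self._requests[stream_id]
        if entry is not None:
            self._requests[stream_id] = None
            self._requests_count -= 1
        return entry

    def has_pending(self) -> bool:
        # Concern 4: stands in for the old `not self._requests` truthiness test.
        return self._requests_count > 0

    def error_all(self, exc: Exception) -> None:
        # Concern 1: swap in a fresh list, then fail everything that was pending.
        pending = self._requests
        self._requests = [None] * len(pending)
        self._requests_count = 0
        # Concern 3: iterate and skip None slots instead of calling popitem().
        for entry in pending:
            if entry is not None:
                entry[0](exc)


errors = []
reqs = ListBackedRequests(size=2)
reqs.store(0, (errors.append, None, None))
reqs.store(5, (errors.append, None, None))  # triggers growth to 6 slots
reqs.pop(0)
reqs.error_all(ConnectionError("connection closed"))
```

Keeping a counter costs one integer update per store and pop, but avoids scanning the whole list on every emptiness check.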

Files to Modify

  • cassandra/connection.py (~15 lines)
  • cassandra/cluster.py (~3 lines)
  • cassandra/io/asyncorereactor.py (~1 line)
  • Tests that inspect _requests as a dict

Impact

Related: #536
