Summary
Per-connection request ID management uses deque(range(N)) for the free ID pool and dict{int → (cb, decoder, result_metadata)} for in-flight request tracking. Replacing both with list-based structures saves about 25 ns (35%) per full request cycle and about 15.7 KB (76%) of memory for a 300-ID pool in the benchmarks below.
Current Architecture (connection.py)
| Structure | Type | Purpose |
|---|---|---|
| request_ids | deque(range(300)) | Pool of available stream IDs |
| _requests | dict{int → tuple} | Maps in-flight stream IDs to callbacks |
| orphaned_request_ids | set() | Timed-out stream IDs awaiting late responses |
Proposed Change
- request_ids: deque → list (used as a stack with pop()/append())
- _requests: dict → list[tuple|None] (indexed by stream ID, None = free)
- orphaned_request_ids: keep as set() (rarely used)
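
The proposed layout can be sketched as follows. This is a minimal illustration, not the driver's code: the StreamTable class and its acquire/release methods are hypothetical names, and handing out low IDs first is an assumption.

```python
class StreamTable:
    """Hypothetical helper (not part of the driver): list-backed ID pool
    plus a slot list indexed by stream ID."""

    def __init__(self, size=300):
        # Free IDs kept as a stack. Stored in reverse so pop() returns
        # low IDs first; this ordering is an assumption of the sketch.
        self.request_ids = list(range(size - 1, -1, -1))
        # One slot per stream ID; None marks a free slot.
        self._requests = [None] * size

    def acquire(self, cb, decoder, result_metadata):
        """Take a free stream ID and record the in-flight request."""
        stream_id = self.request_ids.pop()  # O(1) pop from the end
        self._requests[stream_id] = (cb, decoder, result_metadata)
        return stream_id

    def release(self, stream_id):
        """Return the stored entry and put the ID back in the pool."""
        entry = self._requests[stream_id]
        if entry is None:
            # Mirror dict.pop() behaviour for an unknown stream ID.
            raise KeyError(stream_id)
        self._requests[stream_id] = None
        self.request_ids.append(stream_id)
        return entry
```

Both operations are plain index reads and writes, which is where the per-request saving would come from.
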
Benchmark Results (CPython 3.14)
Full request cycle (get ID + store request + retrieve request + return ID):
| Approach | ns/op | Memory (300 IDs) |
|---|---|---|
| Current (deque + dict) | 72.3 ns | ~20.6 KB |
| Proposed (list + list) | 47.1 ns | ~4.9 KB |
| Saving | 25.2 ns (35%) | ~15.7 KB (76%) |
Per-operation breakdown (dict vs list for _requests):
| Operation | dict | list | Saving |
|---|---|---|---|
| Store request | 43.7 ns | 13.1 ns | 30.1 ns |
| Retrieve request | 43.2 ns | 13.1 ns | 30.1 ns |
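
A micro-benchmark along these lines could be used to reproduce the full-cycle comparison. It is a sketch only, not the script behind the numbers above: the function names are hypothetical, the use of popleft() for the deque is an assumption, and the lambda call overhead is included in both timings, so absolute values will differ.

```python
import timeit
from collections import deque

N = 300  # pool size used in the tables above


def cycle_deque_dict(ids, requests, payload):
    """One full cycle with the current structures (deque + dict)."""
    stream_id = ids.popleft()        # assumption: IDs are taken from the left
    requests[stream_id] = payload
    entry = requests.pop(stream_id)
    ids.append(stream_id)
    return entry


def cycle_list_list(ids, requests, payload):
    """One full cycle with the proposed structures (list + list)."""
    stream_id = ids.pop()
    requests[stream_id] = payload
    entry = requests[stream_id]
    requests[stream_id] = None       # mark the slot free again
    ids.append(stream_id)
    return entry


def bench(number=200_000):
    """Return (current_ns_per_op, proposed_ns_per_op)."""
    payload = (None, None, None)
    d_ids, d_req = deque(range(N)), {}
    l_ids, l_req = list(range(N)), [None] * N
    t_old = timeit.timeit(
        lambda: cycle_deque_dict(d_ids, d_req, payload), number=number)
    t_new = timeit.timeit(
        lambda: cycle_list_list(l_ids, l_req, payload), number=number)
    return t_old / number * 1e9, t_new / number * 1e9


if __name__ == "__main__":
    old_ns, new_ns = bench()
    print(f"deque + dict: {old_ns:.1f} ns/op, list + list: {new_ns:.1f} ns/op")
```
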
Key Implementation Concerns
- error_all_requests() (connection.py:1143): Currently does requests = self._requests; self._requests = {} (atomic swap). With a list, this becomes a swap plus allocating a new [None]*size, or iterate and clear.
- _requests.pop(stream_id) with KeyError (connection.py:1407, cluster.py:4509): Needs conversion to an if lst[stream_id] is None check followed by clearing the slot. list.pop(stream_id) cannot be used here because it would shift every later entry.
- requests.popitem()[1] (connection.py:1162): Needs to find a non-None entry, which requires iteration or count tracking.
- not self._requests truthiness check (asyncorereactor.py:458): A list of None slots is always truthy, so this needs a separate _requests_count tracker or an any() scan.
- Dynamic sizing: Start at 300 and grow the list when highest_request_id exceeds the current size (matching the existing deque growth pattern).
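
The concerns above can be illustrated with a small wrapper. This is a sketch under stated assumptions, not a proposed patch: RequestSlots, store, pop and take_all are hypothetical names, locking is left out, and growth here extends the list just far enough to cover the new ID.

```python
class RequestSlots:
    """Hypothetical wrapper (names are not from the driver) showing how the
    dict-specific call sites could map onto a list of slots."""

    def __init__(self, size=300):
        self._requests = [None] * size
        self._requests_count = 0  # replaces the `not self._requests` check

    def store(self, stream_id, entry):
        if stream_id >= len(self._requests):
            # Dynamic sizing: extend with free slots up to the new ID.
            missing = stream_id + 1 - len(self._requests)
            self._requests.extend([None] * missing)
        if self._requests[stream_id] is None:
            self._requests_count += 1
        self._requests[stream_id] = entry

    def pop(self, stream_id):
        # list.pop(i) would shift later IDs, so read and clear the slot.
        in_range = 0 <= stream_id < len(self._requests)
        entry = self._requests[stream_id] if in_range else None
        if entry is None:
            raise KeyError(stream_id)  # same exception the dict raised
        self._requests[stream_id] = None
        self._requests_count -= 1
        return entry

    def __bool__(self):
        # O(1) emptiness test instead of scanning with any().
        return self._requests_count > 0

    def take_all(self):
        """Swap in a fresh slot list and return the live entries,
        as error_all_requests() would need."""
        old = self._requests
        self._requests = [None] * len(old)
        self._requests_count = 0
        return [entry for entry in old if entry is not None]
```

Keeping the counter next to the list makes the emptiness check cheap, at the cost of updating it on every store and pop. Unlike the single dict reassignment, take_all touches two attributes, so any caller that relies on the swap being atomic would need to hold the connection lock.
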
Files to Modify
- cassandra/connection.py (~15 lines)
- cassandra/cluster.py (~3 lines)
- cassandra/io/asyncorereactor.py (~1 line)
- Tests that inspect _requests as a dict
Impact
Related: #536