(improvement) Eliminate extra BytesIO allocation in encode_message compression path (80-300ns savings, 17-32%) #800
Benchmark results (CPython 3.14, 500k iterations)
Applies to protocol v4 and below, where compression is at the message level. Eliminates one …
…n path
When compression is active (protocol v4 and below), encode_message previously created two BytesIO objects: one for the uncompressed body, then another to write the header + compressed body. The second BytesIO is unnecessary: direct bytes concatenation (header + compressed_body) avoids one BytesIO allocation per compressed message. The non-compression path already used a single BytesIO and is unchanged.
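The difference between the two shapes of the compression path can be sketched as follows. This is a minimal illustration, not the driver's code: the 9-byte header layout, the `COMPRESSED_FLAG` value, the function names, and the use of `zlib.compress` as a stand-in compressor are all assumptions made for the example.

```python
import io
import struct
import zlib

# Assumed 9-byte frame header: version, flags, stream id, opcode, body length.
HEADER = struct.Struct(">BBhBi")
COMPRESSED_FLAG = 0x01  # hypothetical flag bit marking a compressed body


def encode_two_buffers(payload, version=4, stream_id=0, opcode=7):
    """Previous shape: one BytesIO for the body, a second for the frame."""
    body = io.BytesIO()
    body.write(payload)  # stands in for the message serializing its own body
    compressed = zlib.compress(body.getvalue())
    buff = io.BytesIO()  # the second allocation
    buff.write(HEADER.pack(version, COMPRESSED_FLAG, stream_id, opcode, len(compressed)))
    buff.write(compressed)
    return buff.getvalue()


def encode_concat(payload, version=4, stream_id=0, opcode=7):
    """New shape: one BytesIO for the body, then plain bytes concatenation."""
    body = io.BytesIO()
    body.write(payload)
    compressed = zlib.compress(body.getvalue())
    header = HEADER.pack(version, COMPRESSED_FLAG, stream_id, opcode, len(compressed))
    return header + compressed
```

Both functions produce the same bytes for the same input; only the number of intermediate buffer objects differs.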
f3f51d3 to 2b4e88b
Follow-up commit: inline has_checksumming_support check
The replacement is semantically identical.
e3572ae to 04e4ba5
… overhead
Replace ProtocolVersion.has_checksumming_support(protocol_version) calls in encode_message and decode_message with inline integer comparisons using pre-computed module-level constants. This avoids the classmethod dispatch overhead on every encode/decode call.
Benchmark:
classmethod call: 61.3 ns
inline compare: 25.5 ns
saving: 35.8 ns/call (2.4x)
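A rough sketch of the idea behind this commit, for readers who want to see the two forms side by side. The class name, the version bounds (`V5` and `DSE_V1`), and the constant names are assumptions for illustration and may not match what cassandra/protocol.py actually defines.

```python
class ProtocolVersionSketch:
    """Hypothetical stand-in for the driver's ProtocolVersion class."""
    V4 = 4
    V5 = 5
    DSE_V1 = 0x41  # assumed exclusive upper bound for checksumming support

    @classmethod
    def has_checksumming_support(cls, version):
        # Each call pays for attribute lookup plus classmethod dispatch.
        return cls.V5 <= version < cls.DSE_V1


# Pre-computed once at import time (hypothetical names).
_CHECKSUMMING_MIN_VERSION = ProtocolVersionSketch.V5
_CHECKSUMMING_MAX_VERSION_EXCL = ProtocolVersionSketch.DSE_V1


def uses_checksumming_inline(protocol_version):
    # Same predicate as the classmethod, expressed as a chained comparison
    # against module-level constants.
    return _CHECKSUMMING_MIN_VERSION <= protocol_version < _CHECKSUMMING_MAX_VERSION_EXCL
```

In the hot path the comparison would be written directly in the body of the encode/decode function rather than wrapped in a helper; the helper here exists only so the two forms can be compared.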
04e4ba5 to 859ddfe
Summary
Unify the two `BytesIO` allocations in `encode_message` into a single `buff = io.BytesIO()` declared once before the compression branch. In the compression path (protocol v4 and below), the body is written into `buff`, extracted via `getvalue()`, compressed, and returned via direct bytes concatenation (`header + compressed_body`). The non-compression path reuses the same `buff` with the existing seek-based header reservation pattern.
Before (master): 2 `BytesIO()` allocations on the compression path (`buff` + `body`), 1 on the non-compression path.
After: 1 `BytesIO()` allocation on both paths.
Motivation
For protocol v4 and below, compression happens at the message level (v5+ uses segment-level compression with checksumming). The original code created a separate `body = io.BytesIO()` for the uncompressed payload, then wrote the header and the compressed result into a second `buff = io.BytesIO()`. This second buffer is unnecessary: we can write the body into `buff` directly, extract it, compress, and concatenate the header as bytes.
Benchmark (CPython 3.14, per-call, two runs on a quiet machine)
Applies to every compressed message on protocol v4 and below.
Changes
`cassandra/protocol.py`: Move `buff = io.BytesIO()` before the `if` branch. In the compression path, write the body into `buff` instead of a separate `body = io.BytesIO()`, extract via `getvalue()`, compress, and return `header + body` via bytes concat. The non-compression path uses the same `buff` with `seek(9)` header reservation as before.
Testing
Unit tests pass (645 passed, 43 skipped). The non-compression path is structurally unchanged.
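For reference, the single-buffer structure described under Changes might look roughly like the sketch below, with one `BytesIO` shared by both branches and the `seek`-based header reservation on the non-compression path. The function name, header layout, flag value, and `compressor` parameter are illustrative assumptions and not the actual `encode_message` signature.

```python
import io
import struct
import zlib

# Assumed 9-byte frame header: version, flags, stream id, opcode, body length.
HEADER = struct.Struct(">BBhBi")
COMPRESSED_FLAG = 0x01  # hypothetical flag bit marking a compressed body


def encode_single_buffer(payload, version=4, stream_id=0, opcode=7, compressor=None):
    buff = io.BytesIO()  # the only BytesIO allocated on either path
    if compressor is not None:
        # Compression path: body into buff, extract, compress, concatenate.
        buff.write(payload)
        body = compressor(buff.getvalue())
        header = HEADER.pack(version, COMPRESSED_FLAG, stream_id, opcode, len(body))
        return header + body
    # Non-compression path: skip past the header, write the body, then
    # seek back and fill in the header once the body length is known.
    buff.seek(HEADER.size)
    buff.write(payload)
    length = buff.tell() - HEADER.size
    buff.seek(0)
    buff.write(HEADER.pack(version, 0, stream_id, opcode, length))
    return buff.getvalue()
```

Calling `encode_single_buffer(b"hello")` exercises the reservation path, while passing `compressor=zlib.compress` exercises the concatenation path; in both cases the length field in the header equals the number of bytes that follow it.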