# Summary
Analysis of LWT (Lightweight Transaction) performance with prepared statements, identification of bottlenecks, and a plan for improvements across the full execution pipeline.
## Background

LWT queries in this driver are always prepared statements — the `is_lwt` flag is set during the PREPARE response via the ScyllaDB protocol extension `SCYLLA_LWT_ADD_METADATA_MARK` (protocol.py:789-791). The prepared statement execution pipeline for LWT shares the same stages as any other prepared statement, with one critical difference: LWT prepared statements have `result_metadata = None` (query.py:508, test_prepared_statements.py:623), because LWT results have variable column sets:

- `applied=True` → returns `([applied])` (1 column)
- `applied=False` → returns `([applied], col1, col2, ...)` (N columns with the existing values)

This means the `skip_meta` optimization (cluster.py:2956) is disabled for LWT — the server must send full result metadata with every response.
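The two result shapes above can be sketched in plain Python. `lwt_was_applied` below is a hypothetical stand-in for the check that `ResultSet.was_applied` performs, not driver code:

```python
# Hypothetical stand-in for the check ResultSet.was_applied performs on an
# LWT result: the server prepends an "[applied]" boolean column to the row.
def lwt_was_applied(column_names, rows):
    if not rows or not column_names or column_names[0] != "[applied]":
        raise RuntimeError("not a conditional (LWT) result")
    return bool(rows[0][0])

# applied=True: a single [applied] column
print(lwt_was_applied(["[applied]"], [(True,)]))                        # True
# applied=False: [applied] plus the existing row's values
print(lwt_was_applied(["[applied]", "col1", "col2"], [(False, 1, 2)]))  # False
```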
## How LWT Queries Flow Through the Driver

### Detection (Server-Side, ScyllaDB Extension)

- The server advertises `SCYLLA_LWT_ADD_METADATA_MARK` with a `LWT_OPTIMIZATION_META_BIT_MASK=<bitmask>` value
- `_LwtInfo(lwt_meta_bit_mask)` is stored in `ProtocolFeatures.lwt_info` (protocol_features.py:84-96)
- `lwt_info.get_lwt_flag(flags)` checks whether the LWT bit is set (protocol.py:791)
- The result is stored as `PreparedStatement._is_lwt` (query.py:514,539)

### Flag Propagation

| Class | Location | `is_lwt` value |
|---|---|---|
| `Statement` | query.py:378-379 | `False` |
| `PreparedStatement` | query.py:514,539,650-651 | `_is_lwt` from the server |
| `BoundStatement` | query.py:912-913 | `prepared_statement.is_lwt()` |
| `BatchStatement` | query.py:989,1085-1108,1144-1145 | `True` if any added statement is LWT |

### Impact on Routing and Retry

- Load balancing (policies.py:517): LWT queries disable replica shuffling for Paxos consensus ordering
- Retry policy (policies.py:960-965): CAS write timeouts are rethrown (not retried)
- Retry policy (policies.py:1108-1110): serial consistency reads are not downgraded

### Result Handling

- `ResultSet.was_applied` (cluster.py:5323-5351): checks the `[applied]` column of LWT results
- `LWTException` / `check_applied()` (cqlengine/query.py:45-78): CQLEngine ORM integration
## Identified Performance Bottlenecks

### 1. LWT Disables Result Metadata Caching (Fundamental Protocol Limitation — NOT FIXABLE)

LWT prepared statements have `result_metadata = None`, so `skip_meta=bool(prepared_statement.result_metadata)` evaluates to `False` (cluster.py:2956). Every LWT execution includes full metadata from the server, and the driver re-parses it every time.

Investigation conclusion: This is a correct, intentional protocol-level design decision, not a driver-side oversight. The server sets `NO_METADATA_FLAG` (0x0004) in the PREPARE response because LWT result schemas are non-deterministic at prepare time. See the B1 analysis below for the full explanation.

### 2. namedtuple Class Creation Per Response (High Impact)

`named_tuple_factory` (the default row factory) calls Python's `namedtuple()`, which internally uses `exec()` — on every response. For single-row LWT results, this is the dominant per-response cost.

### 3. ParseDesc Reconstruction Per Response (High Impact)

`ParseDesc` construction (column name extraction, `ColDesc` namedtuple creation, deserializer lookup) runs on every result set. For LWT's small 1-row results, this is the dominant decode cost.

### 4. Per-Value Column Encryption Policy Check (Medium Impact)

`column_encryption_policy` is checked per value in both the bind and decode paths, even when no encryption policy is set (99%+ of deployments).

### 5. Limited Cython Serializer Coverage (Medium Impact)

Only `FloatType`, `DoubleType`, `Int32Type`, and `VectorType` have Cython serializers. All other types fall through to `GenericSerializer`.

### 6. Hardcoded Timeouts (Low Impact)

Hardcoded timeout values at cluster.py:2296-2297 and cluster.py:4520.

### 7. WeakValueDictionary for Prepared Statement Cache (Low Impact)

If the user discards all references to a `PreparedStatement`, it gets GC'd and needs re-preparation — a hidden extra round-trip (cluster.py:1448).
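Bottleneck 7 is easy to demonstrate in isolation. The `PreparedStatement` class below is a stand-in, not the driver's:

```python
import gc
import weakref

class PreparedStatement:  # stand-in for the driver's class
    pass

cache = weakref.WeakValueDictionary()
stmt = PreparedStatement()
cache["SELECT ..."] = stmt
assert "SELECT ..." in cache       # cached while the caller holds a reference

del stmt                           # caller drops its last strong reference
gc.collect()                       # entry vanishes (immediate on CPython refcounting)
assert "SELECT ..." not in cache   # next execution would trigger a hidden re-prepare
```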
## Existing PRs Addressing the Pipeline

The existing PRs (#740, #741, #742) target the pipeline: optimizing `bind()`, adding `__slots__`, and the row-factory path, where the `exec()` called per response is 135x overhead.

## Recommended PR Landing Order

Phase 1 — land existing PRs (priority order for LWT impact):

- `__slots__` — no dependencies
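As a sketch of what the row-factory fix amounts to (illustrative, not the PR's actual code): cache the namedtuple class by its column-name tuple, so `namedtuple()` — which builds a new class via `exec()` — runs once per distinct schema rather than once per response:

```python
from collections import namedtuple

_row_class_cache = {}

def cached_named_tuple_factory(colnames, rows):
    key = tuple(colnames)
    cls = _row_class_cache.get(key)
    if cls is None:
        # rename=True sanitizes names like "[applied]" into valid identifiers
        cls = namedtuple("Row", key, rename=True)
        _row_class_cache[key] = cls
    return [cls(*row) for row in rows]

first = cached_named_tuple_factory(["[applied]", "a"], [(False, 1)])
second = cached_named_tuple_factory(["[applied]", "a"], [(False, 2)])
assert type(first[0]) is type(second[0])  # class built once, then reused
```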
## New Work: LWT-Specific Optimizations

### B1. LWT Result Metadata Caching (NOT FEASIBLE)

Idea: cache both result metadata variants (applied vs. not-applied), keyed by column count.

Status: Investigated and ruled out. The `result_metadata = None` / `NO_METADATA_FLAG` behavior is a correct protocol-level design decision by the Cassandra/ScyllaDB server, not a driver-side oversight or performance gap that can be closed.

Why LWT result schemas cannot be cached: a single prepared LWT statement can produce three or more distinct result schemas across its lifetime:

| Scenario | Columns Returned |
|---|---|
| Applied (row inserted) | `[applied]` — 1 column |
| Not applied (row exists) | `[applied], a, b, d` — 4 columns |
| Not applied (after `ALTER TABLE ADD c`) | `[applied], a, b, c, d` — 5 columns |

This is verified by the integration test `_test_updated_conditional` (tests/integration/standard/test_prepared_statements.py:595-634, PYTHON-847).

A dual-schema caching approach fails because:

- **`ALTER TABLE` invalidates the not-applied schema.** Adding or dropping columns changes the column count and types; a cached schema would silently produce wrong deserialization after DDL changes.
- **The schema space is unbounded.** Over the lifetime of a prepared statement there are N+1 possible schemas (1 applied + N historical table schema versions after DDL changes).
- **Column count is ambiguous.** Different schema versions could produce the same column count after enough `ALTER TABLE` operations.
- **Different LWT types produce different not-applied shapes.** `INSERT IF NOT EXISTS` returns all columns; `UPDATE IF col = val` returns only the IF-clause columns; `UPDATE IF EXISTS` never has additional columns.
- **The server has already made this decision.** `NO_METADATA_FLAG` is set server-side in the PREPARE response; the driver's `skip_meta=bool(result_metadata)` simply mirrors it. Changing this would require a new CQL protocol extension in ScyllaDB itself.

History: ScyllaDB issue scylladb/scylladb#6259 (2020, kostja) addressed what was feasible — adding the `is_lwt` flag to PREPARE responses for routing purposes. Metadata caching was deliberately out of scope because the fundamental variability problem cannot be solved driver-side.

Driver git history:

- 2016 (PYTHON-71): `result_metadata` and the `skip_meta` optimization introduced. LWT returned `[]` for `result_metadata`.
- 2017 (PYTHON-847): Integration tests added proving `result_metadata` stays empty and `result_metadata_id` does NOT change even after `ALTER TABLE`.
- 2019: Refactored from `[]` to `None` (early return in `recv_results_metadata()`). Tests updated to `assertIsNone(prepared_statement.result_metadata)`.
- 2025: ScyllaDB LWT protocol extension (`SCYLLA_LWT_ADD_METADATA_MARK`) added for `is_lwt` flag detection.

### B2. LWT-Aware Retry Policy (Medium Impact, Low Complexity)
Create configurable CAS retry behavior — allow retrying CAS timeouts on the same coordinator with backoff.
Status: Not yet started.
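A minimal sketch of the decision logic such a policy could use (the function name and defaults are illustrative assumptions, not the driver's API):

```python
import random

# Illustrative sketch, not driver code: decide whether to retry a CAS write
# timeout on the same coordinator, with exponential backoff plus jitter.
def cas_retry_decision(retry_num, max_retries=3, base_delay=0.05, max_delay=1.0):
    """Return (should_retry, delay_seconds) for a CAS write timeout."""
    if retry_num >= max_retries:
        return False, 0.0  # give up: rethrow to the application, as today
    delay = min(max_delay, base_delay * (2 ** retry_num))
    return True, delay + random.uniform(0.0, delay / 2)
```

In the real driver this logic would live in a `RetryPolicy.on_write_timeout` override guarded by the CAS write type; the backoff scheduling shown here is the part the driver does not currently provide.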
### B3. LWT Performance Benchmark Suite (DONE)

Create benchmarks for: LWT bind, LWT decode (applied/not-applied), LWT end-to-end throughput, ParseDesc cache hit rate for LWT, and a prepared SELECT vs. prepared LWT comparison.

Status: Completed. Branch `perf/lwt-benchmarks` on the mykaul remote. 67 pytest-benchmark tests covering bind, decode, `was_applied`, and comparison benchmarks.

### B4. Optimize `was_applied` Fast Path (Low Impact, Low Complexity)

Use `is_lwt()` to skip the batch-detection regex in `ResultSet.was_applied`.

Status: Completed. PR mykaul/python-driver#13. Unit tests added.

Benchmark results (inconclusive): Microbenchmarks on a noisy system (5 runs, pinned to a single CPU, 5000+ rounds each) showed no statistically significant difference. The optimization saves an `isinstance()` call and a regex `match()` (~tens of nanoseconds), which is lost in the noise of property access and row-factory overhead (~3400 ns median). The value is in code clarity and in avoiding an unnecessary regex on the common non-batch LWT path.

### B5. Pre-allocate `values` List in `BoundStatement.bind()` (Low Impact, Low Complexity)

Use `[None] * len(values)` with index assignment instead of repeated `append()`.

Status: Completed. PR mykaul/python-driver#12. No new tests needed (the existing 37 bind tests cover all paths).

Benchmark results (inconclusive): Microbenchmarks showed no measurable improvement. The `bind()` cost is dominated by `serialize()` calls per value, not list operations. Results across multiple runs were within noise (±3-4%). The optimization is essentially neutral — the `enumerate(zip(...))` wrapper may offset the savings from avoiding `append()`.
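The B5 change boils down to the following pattern (a generic sketch with a dummy `serialize`, not the driver's actual `bind()`):

```python
def serialize(v):
    # dummy stand-in for the driver's per-value serialization
    return str(v).encode("ascii")

def bind_with_append(values):
    out = []
    for v in values:
        out.append(serialize(v))   # list may resize repeatedly as it grows
    return out

def bind_preallocated(values):
    out = [None] * len(values)     # allocate once at the final size
    for i, v in enumerate(values):
        out[i] = serialize(v)      # index assignment, no resizing
    return out

assert bind_with_append([1, "x"]) == bind_preallocated([1, "x"])
```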
## Expected Combined Impact

For a typical LWT prepared statement execution (single-row result, default `named_tuple_factory`), the decode-phase improvements are the most impactful, because LWT results are small (1 row), which makes fixed per-response costs (ParseDesc construction, namedtuple class creation) the dominant overhead rather than per-row processing.
Note: Bottleneck #1 (LWT disables result metadata caching) is a fundamental CQL protocol limitation that cannot be addressed driver-side. The per-response metadata overhead is the inherent cost of LWT's variable result schemas. The existing PRs (#740, #741, #742) minimize the cost of processing that metadata once received.