Problem
When tablets are enabled, LWT queries lose their natural token-ring replica ordering, meaning the Paxos leader is not prioritized. This adds an extra network hop and increases latency for every LWT query routed through the tablet path.
Root Cause
In TokenAwarePolicy.make_query_plan() (cassandra/policies.py:509-513), the tablet code path constructs the replica list from the child policy's query plan order rather than the natural token-ring order.
The child policy's make_query_plan() yields hosts in round-robin order (starting from a rotating position). Filtering this by tablet membership preserves the round-robin order, not the token-ring order. The Paxos leader (first natural replica) can end up at any position in the list.
Even though LWT queries skip the shuffle(replicas) call at lines 517-518, the ordering is already wrong because it came from the child policy, not from the token map.
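The effect can be reproduced without the driver. The following is a minimal sketch, not the driver's code: the host names, the round_robin_plan() helper, and the tablet_path_replicas() function are hypothetical stand-ins, and plain strings replace host objects. It assumes the first entry of the tablet's replica list stands in for the Paxos leader.

```python
from itertools import cycle, islice

def round_robin_plan(hosts, start):
    """Hypothetical stand-in for a round-robin child policy: yield every
    host once, beginning at a rotating start position."""
    n = len(hosts)
    begin = start % n
    return list(islice(cycle(hosts), begin, begin + n))

all_hosts = ["h1", "h2", "h3", "h4", "h5"]

# Assumed replica order for one tablet; "h3" stands in for the Paxos leader.
tablet_replicas = ["h3", "h5", "h1"]

def tablet_path_replicas(start):
    """Filter the child plan by tablet membership, as the issue describes.
    The tablet's replicas are used only as a set, so their order is lost."""
    replica_set = set(tablet_replicas)
    return [h for h in round_robin_plan(all_hosts, start) if h in replica_set]

# One replica ordering per rotating start position of the child policy.
orders = {start: tablet_path_replicas(start) for start in range(len(all_hosts))}
leader_positions = {start: order.index("h3") for start, order in orders.items()}
```

In this sketch the stand-in leader lands at index 0, 1, or 2 depending only on where the round-robin rotation happens to start.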
Impact
RackAwareRoundRobinPolicy)
The tablet.replicas field contains the correct replica order, but it is used only as a set for membership testing, which discards the ordering information.
Proposed Fix
For LWT queries on the tablet path, use the order from tablet.replicas directly instead of the child policy's round-robin order.
The LWT path should also bypass yield_in_order() as described in #780.
Related
RackAwareRoundRobinPolicy demotes Paxos leader for LWT (non-tablet path)
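As a rough illustration of the proposed fix for the tablet path, the sketch below keeps the tablet's own replica order for LWT queries and falls back to the child plan's order otherwise. It is not the driver's implementation: order_tablet_replicas() and its parameters are hypothetical, plain strings stand in for host objects, and dropping replicas that the child plan does not offer is an assumption of this sketch.

```python
def order_tablet_replicas(tablet_replicas, child_plan, is_lwt):
    """Hypothetical helper: choose the replica order for the tablet path.

    tablet_replicas: replicas in the order recorded for the tablet.
    child_plan: hosts in the order yielded by the child policy.
    is_lwt: True for lightweight-transaction queries.
    """
    if is_lwt:
        # Keep the tablet's replica order so the first replica stays first.
        # Assumption: hosts missing from the child plan are skipped.
        available = set(child_plan)
        return [h for h in tablet_replicas if h in available]
    # Non-LWT behaviour as described in the issue: child plan order,
    # filtered by tablet membership.
    replica_set = set(tablet_replicas)
    return [h for h in child_plan if h in replica_set]

tablet_replicas = ["h3", "h5", "h1"]
child_plan = ["h4", "h5", "h1", "h2", "h3"]  # one possible round-robin rotation

lwt_order = order_tablet_replicas(tablet_replicas, child_plan, is_lwt=True)
regular_order = order_tablet_replicas(tablet_replicas, child_plan, is_lwt=False)
```

With this rotation, the LWT order starts with "h3" while the regular order starts with "h5", which is the difference the proposed fix targets.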