fix(db): nested toArray includes drop children when sibling groups share a correlation key (#1501)#1607
Conversation
…#1501) Adds a failing test reproducing TanStack#1501: with three collection levels (products -> priceRanges -> region), when two priceRanges in different parent groups share the same deepest correlation key (regionId === 1), one of the two nested `region` arrays comes back empty. The nested pipeline buffer is shared by reference across per-parent-group states (createPerEntryIncludesStates) and drainNestedBuffers deletes a buffer entry after routing it to the first matching parent group, so the sibling that drains second finds nothing. Note: the minimal repro in the issue does not trigger the bug as written (its dummy `eq(p.id, _.id)` correlation against a single-row anchor with findOne collapses to one product, so the two overlapping siblings never coexist in the output). This test puts both sibling groups in the result so the collision actually occurs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughNested include routing now tracks multiple parents per deepest correlation key, replays drained nested rows from snapshots, and adds regression coverage for shared-key three-level includes across preload, insert, delete, live collection, and materialized variants. ChangesNested toArray fan-out and snapshotting
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
More templates
@tanstack/angular-db
@tanstack/browser-db-sqlite-persistence
@tanstack/capacitor-db-sqlite-persistence
@tanstack/cloudflare-durable-objects-db-sqlite-persistence
@tanstack/db
@tanstack/db-ivm
@tanstack/db-sqlite-persistence-core
@tanstack/electric-db-collection
@tanstack/electron-db-sqlite-persistence
@tanstack/expo-db-sqlite-persistence
@tanstack/node-db-sqlite-persistence
@tanstack/offline-transactions
@tanstack/powersync-db-collection
@tanstack/query-db-collection
@tanstack/react-db
@tanstack/react-native-db-sqlite-persistence
@tanstack/rxdb-db-collection
@tanstack/solid-db
@tanstack/svelte-db
@tanstack/tauri-db-sqlite-persistence
@tanstack/trailbase-db-collection
@tanstack/vue-db
commit: |
…ation key (TanStack#1501) With 3+ levels of nested toArray includes, when two children in different parent groups shared the same deepest correlation key, only one received the nested rows and the other came back empty. Two compounding causes: - nestedRoutingIndex mapped each nested correlation key to a single parent group (last-writer-wins), and the shared buffer entry was deleted after routing to the first match, so sibling groups sharing the key were dropped. - the nested pipeline does not re-emit already-materialized rows, so a parent group that starts referencing an existing correlation key after the rows were drained (e.g. a sibling inserted after the initial load) saw nothing. Fixes: - nestedRoutingIndex now maps a nested correlation key to a Set of parent groups; drainNestedBuffers fans buffered grandchild changes out to every ready parent group before dropping the buffer entry. - a per-level cumulative snapshot of net-present grandchild rows seeds late-arriving parent groups from what their siblings already received. Routing-index inserts/deletes and parent-delete cleanup are updated to maintain the per-key parent sets. Adds tests covering initial load, a sibling inserted after load, and deleting one of two siblings that share a correlation key. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/db/tests/query/includes.test.ts (1)
5078-5079: 🩺 Stability & Availability | 🔵 TrivialReplace the fixed 50ms sleeps with a deterministic wait helper.
flushPromises()is already available in the test utilities, and the same change should be applied to the matching delete case at line 5160.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/db/tests/query/includes.test.ts` around lines 5078 - 5079, The tests in `includes.test.ts` use fixed 50ms sleeps after `priceRanges.insert(...)`, which should be replaced with the deterministic `flushPromises()` helper already available in the test utilities. Update the insert case and the matching delete case to await `flushPromises()` instead of `setTimeout`, keeping the same behavior while removing timing flakiness.Source: Coding guidelines
packages/db/src/query/live/collection-config-builder.ts (1)
1153-1158: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAvoid adding new
anyto snapshot state.
SnapshotRow.valueand the new helper signature can useunknownhere; the code only stores/replays the value and does not need unchecked access. As per coding guidelines,**/*.{ts,tsx}: Avoid usinganytypes; useunknowninstead when the type is truly unknown.♻️ Proposed type tightening
type SnapshotRow = { - value: any + value: unknown orderByIndex: string | undefined /** Net multiplicity (inserts − deletes) currently materialized for this row */ count: number } @@ function accumulateSnapshot( setup: NestedIncludesSetup, nestedCorrelationKey: unknown, - childChanges: Map<unknown, Changes<any>>, + childChanges: Map<unknown, Changes<unknown>>, ): void {Also applies to: 1373-1377
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/db/src/query/live/collection-config-builder.ts` around lines 1153 - 1158, The snapshot state is introducing an unnecessary any type in SnapshotRow.value and the related helper signature, even though the value is only stored and replayed. Tighten the types in collection-config-builder by changing SnapshotRow and the affected helper used around the snapshot logic to use unknown instead of any, and keep the existing handling unchanged except for type-safe narrowing where needed.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/db/src/query/live/collection-config-builder.ts`:
- Around line 1387-1399: Clone snapshot values before assigning them into row
state in collection-config-builder’s snapshot/replay logic, because row.value is
currently sharing the same object as changes.value and later mutation of
changes.value can strip routing metadata. Update the row initialization and
replay paths in the affected update/merge flow (including the shared logic
around the count/orderByIndex handling) to store a cloned copy instead of the
original object, so later cleanup cannot affect previously buffered rows and
nested routing can be rebuilt correctly.
---
Nitpick comments:
In `@packages/db/src/query/live/collection-config-builder.ts`:
- Around line 1153-1158: The snapshot state is introducing an unnecessary any
type in SnapshotRow.value and the related helper signature, even though the
value is only stored and replayed. Tighten the types in
collection-config-builder by changing SnapshotRow and the affected helper used
around the snapshot logic to use unknown instead of any, and keep the existing
handling unchanged except for type-safe narrowing where needed.
In `@packages/db/tests/query/includes.test.ts`:
- Around line 5078-5079: The tests in `includes.test.ts` use fixed 50ms sleeps
after `priceRanges.insert(...)`, which should be replaced with the deterministic
`flushPromises()` helper already available in the test utilities. Update the
insert case and the matching delete case to await `flushPromises()` instead of
`setTimeout`, keeping the same behavior while removing timing flakiness.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ace73dfd-754c-45cb-8106-836a50331bd6
📒 Files selected for processing (3)
.changeset/nested-toarray-shared-buffer-overlap.mdpackages/db/src/query/live/collection-config-builder.tspackages/db/tests/query/includes.test.ts
✅ Files skipped from review due to trivial changes (1)
- .changeset/nested-toarray-shared-buffer-overlap.md
| row = { | ||
| value: changes.value, | ||
| orderByIndex: changes.orderByIndex, | ||
| count: 0, | ||
| } | ||
| snap.set(childKey, row) | ||
| } | ||
| row.count += changes.inserts - changes.deletes | ||
| if (changes.inserts > 0) { | ||
| row.value = changes.value | ||
| if (changes.orderByIndex !== undefined) { | ||
| row.orderByIndex = changes.orderByIndex | ||
| } |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Clone snapshot values before storing or replaying them.
row.value currently aliases changes.value; later cleanup deletes INCLUDES_ROUTING from changes.value at Line 1984. A late replay can then seed a row without routing metadata, preventing deeper nested routing from being rebuilt via Lines 1544-1547.
🐛 Proposed fix
+function cloneSnapshotValue<T>(value: T): T {
+ if (value == null || typeof value !== `object`) return value
+ if (Array.isArray(value)) return [...value] as T
+ return { ...(value as Record<PropertyKey, unknown>) } as T
+}
+
function accumulateSnapshot(
setup: NestedIncludesSetup,
nestedCorrelationKey: unknown,
childChanges: Map<unknown, Changes<any>>,
@@
row = {
- value: changes.value,
+ value: cloneSnapshotValue(changes.value),
orderByIndex: changes.orderByIndex,
count: 0,
}
@@
if (changes.inserts > 0) {
- row.value = changes.value
+ row.value = cloneSnapshotValue(changes.value)
if (changes.orderByIndex !== undefined) {
row.orderByIndex = changes.orderByIndex
}
@@
byChild.set(childKey, {
deletes: 0,
inserts: row.count,
- value: row.value,
+ value: cloneSnapshotValue(row.value),
orderByIndex: row.orderByIndex,
})Also applies to: 1436-1443
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/db/src/query/live/collection-config-builder.ts` around lines 1387 -
1399, Clone snapshot values before assigning them into row state in
collection-config-builder’s snapshot/replay logic, because row.value is
currently sharing the same object as changes.value and later mutation of
changes.value can strip routing metadata. Update the row initialization and
replay paths in the affected update/merge flow (including the shared logic
around the count/orderByIndex handling) to store a cloned copy instead of the
original object, so later cleanup cannot affect previously buffered rows and
nested routing can be rebuilt correctly.
|
Here's a complete explanation of the bugs and the fix. This PR implements the snapshot accumulation fix but it means we store each distinct nested row in that snapshot. The explanation below also documents a potential alternative. The exampleFor each product you want its price ranges, and for each price range its region. Three levels deep. Our problem data:
How includes actually compute (the important part)You might imagine the engine runs the "region" query fresh for each price range. It doesn't. That would be slow and wouldn't update reactively. Instead: There is ONE shared "region" pipeline. It runs once over all regions and emits region rows tagged with a correlation key — the value that says "who do I belong to". For region, that key is Just one entry. Even though two price ranges want region 1, the pipeline produces it once, keyed by Results land in a "buffer" — a temporary mailbox: Then a "drain" step delivers buffered rows to the right parents. This is the routing step. To deliver region 1 to the correct price range, the engine keeps a routing index: Each price-range group has its own private copy of region results (its own little child collection), so the final tree can show each one its own Why it brokeThe whole design quietly assumed one nested key belongs to one parent. Two things follow from that assumption, and both are wrong when two siblings share Bug 1 — the routing index only remembered one parent. It's a plain key→value map, so the second write clobbers the first. Now Bug 2 — late arrivals get nothing. Say T-Shirt's priceRange exists at load, but Hoodie's priceRange 3 is inserted later. By then the region pipeline already emitted region 1 (at load), it was drained, and the buffer entry was deleted. When priceRange 3 shows up, the pipeline does not re-emit region 1 — from its point of view nothing about regions changed; it already produced that row. So the buffer is empty and there's nothing to deliver to Hoodie. Empty array again. Both of these were confirmed with debug logging before writing the fix. The fix — two matching piecesFix 1: let one nested key map to many parentsThe routing index value changes from a single parent to a Set of parents:
Fix 2: remember delivered rows so late arrivals can be seededBecause the pipeline won't re-emit region 1 for a price range added later, the engine needs its own memory of "what region rows currently exist for regionId 1". That's the new snapshot:
One-line summaryThe nested-includes system assumed each nested key had exactly one parent and treated delivered rows as throwaway. The fix makes a nested key able to feed multiple parents (Set instead of single value + fan-out on drain), and keeps a snapshot of currently-existing rows so parents that show up late can be handed the same data their siblings already received. Is the snapshot a memory problem?Partly a fair concern, but it's bounded, not unbounded:
So it's not a leak and not multiplied by parents. But it is a real added cost: roughly O(distinct nested rows) extra — essentially one more index over the deepest level, for the lifetime of the live query. For a query with a very large nested collection, that's non-trivial memory. The alternative: copy from an existing sibling instead of keeping a snapshotThe key realizationWhen does a late-arriving parent need seeding at all? Only when the region rows were already delivered and the buffer entry deleted in an earlier flush, and the pipeline won't re-emit them. But "already delivered to whom?" — to a sibling. So whenever seeding is needed, at least one sibling already has the rows materialized in its own private copy. That's the insight the alternative exploits: you don't need a separate stash of the data, because a sibling is already holding a perfectly good copy of it. Go ask the sibling. How "copy from sibling" worksWalk through Hoodie's priceRange 3 arriving late, referencing
So instead of consulting a snapshot, step 2 reaches into a sibling's already-materialized collection. What it costs you in codeThe snapshot stored exactly what the delivery machinery needs: for each row, its
then iterate the sibling's collection and reconstruct insert-changes from each row. It's the same idea as You also need a small guard: "pick a sibling that is actually populated right now." The routing-index Set gives you candidates, but you'd skip the newcomer itself and any sibling that happens to be empty, and copy from the first good one. (As argued above, a good one is essentially guaranteed to exist whenever seeding is needed — if none did, the buffer path would still be handling it.) The two approaches compared
How to chooseThe snapshot trades memory for simplicity and predictability: one clean structure, consulted in one place, no "go find a healthy donor" logic. Copy-from-sibling trades that memory away but pays in code complexity and a reconstruction step on each late insert. For typical includes (modest nested result sets) the snapshot's overhead is negligible and the simpler code wins. If you expect very large nested collections where doubling an index matters, copy-from-sibling is the better fit — and it's a localized swap: only |
…lize includes The nested-includes routing fix is independent of how each level is materialized. Add regression tests proving sibling parent groups that share a deepest correlation key resolve their grandchildren when the nested levels are left as live Collections (no wrapper) and when wrapped with materialize(), mirroring the existing toArray coverage. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/db/tests/query/includes.test.ts`:
- Around line 5171-5261: The current live/materialize regression tests only
verify preload behavior; extend the existing nested-collection coverage in the
relevant `it(...)` cases to assert post-load mutations as well. Use the same
`createLiveQueryCollection`, `toTree`, and nested `q.from(...)` setup to add
insert/delete assertions for the live and materialized variants, and include
async stale-order/race scenarios that exercise the same shared-correlation-key
path after initial load. Keep the new assertions anchored to the existing
collection setup so the bug is reproduced across both materialization modes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e827313e-ce5c-41ab-a147-be0bec3dfe49
📒 Files selected for processing (1)
packages/db/tests/query/includes.test.ts
| // The shared-correlation-key routing is independent of how each level is | ||
| // materialized, so the same guarantee must hold when the nested levels are | ||
| // left as live Collections (no toArray/materialize wrapper). | ||
| it(`resolves nested grandchildren for sibling groups when levels stay Collections`, async () => { | ||
| type Product = { id: number; title: string } | ||
| type PriceRange = { id: number; productId: number; regionId: number } | ||
| type Region = { id: number; name: string } | ||
|
|
||
| const products = createCollection( | ||
| localOnlyCollectionOptions<Product>({ | ||
| id: `shared-corr-collection-products`, | ||
| getKey: (p) => p.id, | ||
| initialData: [ | ||
| { id: 1, title: `T-Shirt` }, | ||
| { id: 2, title: `Hoodie` }, | ||
| ], | ||
| }), | ||
| ) | ||
| const priceRanges = createCollection( | ||
| localOnlyCollectionOptions<PriceRange>({ | ||
| id: `shared-corr-collection-price-ranges`, | ||
| getKey: (r) => r.id, | ||
| initialData: [ | ||
| { id: 1, productId: 1, regionId: 1 }, | ||
| { id: 2, productId: 1, regionId: 2 }, | ||
| { id: 3, productId: 2, regionId: 1 }, | ||
| ], | ||
| }), | ||
| ) | ||
| const regions = createCollection( | ||
| localOnlyCollectionOptions<Region>({ | ||
| id: `shared-corr-collection-regions`, | ||
| getKey: (r) => r.id, | ||
| initialData: [ | ||
| { id: 1, name: `Europe` }, | ||
| { id: 2, name: `North America` }, | ||
| ], | ||
| }), | ||
| ) | ||
|
|
||
| await Promise.all([ | ||
| products.preload(), | ||
| priceRanges.preload(), | ||
| regions.preload(), | ||
| ]) | ||
|
|
||
| const collection = createLiveQueryCollection({ | ||
| id: `shared-corr-collection-live`, | ||
| query: (q) => | ||
| q.from({ p: products }).select(({ p }) => ({ | ||
| id: p.id, | ||
| title: p.title, | ||
| priceRanges: q | ||
| .from({ pr: priceRanges }) | ||
| .where(({ pr }) => eq(pr.productId, p.id)) | ||
| .select(({ pr }) => ({ | ||
| id: pr.id, | ||
| regionId: pr.regionId, | ||
| region: q | ||
| .from({ r: regions }) | ||
| .where(({ r }) => eq(r.id, pr.regionId)) | ||
| .select(({ r }) => ({ id: r.id, name: r.name })), | ||
| })), | ||
| })), | ||
| }) | ||
| await collection.preload() | ||
|
|
||
| // toTree recursively unwraps the nested live Collections into arrays. | ||
| expect(toTree(collection)).toEqual([ | ||
| { | ||
| id: 1, | ||
| title: `T-Shirt`, | ||
| priceRanges: [ | ||
| { id: 1, regionId: 1, region: [{ id: 1, name: `Europe` }] }, | ||
| { | ||
| id: 2, | ||
| regionId: 2, | ||
| region: [{ id: 2, name: `North America` }], | ||
| }, | ||
| ], | ||
| }, | ||
| { | ||
| id: 2, | ||
| title: `Hoodie`, | ||
| priceRanges: [ | ||
| { id: 3, regionId: 1, region: [{ id: 1, name: `Europe` }] }, | ||
| ], | ||
| }, | ||
| ]) | ||
| }) | ||
|
|
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Expand live/materialize variants with mutation-path assertions.
These two tests validate preload only. Please add post-load insert/delete coverage (and async stale-order/race coverage) for these same materialization modes to protect the full regression surface.
As per coding guidelines, **/*.test.{ts,tsx,js}: “Always add unit tests that reproduce a bug before fixing it…” and “Test corner cases including … async race conditions …”.
Also applies to: 5262-5353
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/db/tests/query/includes.test.ts` around lines 5171 - 5261, The
current live/materialize regression tests only verify preload behavior; extend
the existing nested-collection coverage in the relevant `it(...)` cases to
assert post-load mutations as well. Use the same `createLiveQueryCollection`,
`toTree`, and nested `q.from(...)` setup to add insert/delete assertions for the
live and materialized variants, and include async stale-order/race scenarios
that exercise the same shared-correlation-key path after initial load. Keep the
new assertions anchored to the existing collection setup so the bug is
reproduced across both materialization modes.
Source: Coding guidelines
|
Had a quick discussion with @samwillis on the 2 alternatives. We decided to go for correctness and simplicity so we pick the snapshot approach. |
Fixes #1501.
The bug
With three (or more) levels of nested
toArrayincludes (e.g.products → priceRanges → region), when two children in different parent groups share the same deepest correlation key, only one of them receives the nested rows — the other comes back as an empty array:Root cause
The nested pipeline writes results into a buffer shared across per-parent-group states, and routing assumes one parent per nested correlation key. Two compounding issues:
nestedRoutingIndexmapped each nested correlation key to a single parent group (last-writer-wins), anddrainNestedBuffersdeleted the buffer entry after routing to the first match — so a sibling group sharing the key was dropped.Fix
nestedRoutingIndexnow maps a nested correlation key to a Set of parent groups;drainNestedBuffersfans buffered grandchild changes out to every ready parent group before dropping the buffer entry. Routing-index inserts/deletes and parent-delete cleanup maintain the per-key parent sets.Tests
Three tests under
nested toArray includes (depth 3+)cover initial load, a sibling inserted after load, and deleting one of two siblings that share a correlation key. Fulltests/query/suite (1596 tests) passes.🤖 Generated with Claude Code
Summary by CodeRabbit
toArrayincludes where sibling parent groups sharing the same deepest correlation key could produce incorrect or missing nested results.