perf(store): lazy CacheMultiStore — defer cachekv alloc until access #2806
pdrobnjak wants to merge 3 commits into perf/lazy-init-sorted-cache
Conversation
CacheMultiStore previously created cachekv.NewStore wrappers eagerly for all ~21 module stores on every snapshot. This was wasteful because:

1. The OCC scheduler's prepareTask creates a CMS then immediately replaces all stores with VersionIndexedStores via SetKVStores — the cachekv stores allocated in the CMS were discarded unused.
2. EVM transactions only access ~3 of 21 modules, so ~85% of cachekv stores per statedb snapshot were never touched.

This change defers cachekv.NewStore allocation by storing parent references in storeParents/gigaStoreParents maps and materializing cachekv wrappers lazily on first GetKVStore/GetStore/GetGigaKVStore access.

Benchmark results (M4 Max, 1000 EVM transfers/block):
- TPS: 8,000-8,400 → 8,000-8,800 (~5% median uplift)
- cachekv.NewStore allocations: -20,614 MB (-360M objects)
- cachekv.Write allocations: -8,058 MB
- memclrNoHeapPointers CPU: -9.94s
- mallocgc CPU: -7.23s cumulative

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
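A minimal sketch of this shape, assuming upstream cosmos-sdk store types; the real sei-cosmos/store/cachemulti Store carries more fields and methods than shown, and cachekv.NewStore's argument list may differ in the fork:

```go
// Sketch only: lazy CacheMultiStore as described above.
// Parents are recorded at construction; cachekv wrappers are created
// on first access instead of eagerly for all ~21 module stores.
package cachemulti

import (
	"github.com/cosmos/cosmos-sdk/store/cachekv"
	"github.com/cosmos/cosmos-sdk/store/types"
)

type Store struct {
	stores       map[types.StoreKey]types.CacheWrap // materialized cachekv wrappers
	storeParents map[types.StoreKey]types.KVStore   // lazy: raw parents, not yet wrapped
}

func newCacheMultiStoreFromCMS(cms Store) Store {
	// Record parent references only; allocate nothing per store yet.
	parents := make(map[types.StoreKey]types.KVStore, len(cms.stores))
	for k, v := range cms.stores {
		parents[k] = v.(types.KVStore)
	}
	return Store{
		stores:       make(map[types.StoreKey]types.CacheWrap, len(parents)),
		storeParents: parents,
	}
}

// ensureStore materializes the cachekv wrapper for key on first use.
func (cms *Store) ensureStore(key types.StoreKey) {
	if _, ok := cms.stores[key]; ok {
		return
	}
	if parent, ok := cms.storeParents[key]; ok {
		cms.stores[key] = cachekv.NewStore(parent) // argument list simplified
	}
}

// GetKVStore pays for the cachekv allocation only when a module is touched.
func (cms *Store) GetKVStore(key types.StoreKey) types.KVStore {
	cms.ensureStore(key)
	return cms.stores[key].(types.KVStore)
}
```

GetStore and GetGigaKVStore follow the same pattern against gigaStoreParents.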
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
Codecov Report ❌ Patch coverage is

Additional details and impacted files

@@                Coverage Diff                 @@
##    perf/lazy-init-sorted-cache    #2806    +/-   ##
=======================================================
+ Coverage           46.91%        52.28%    +5.37%
=======================================================
  Files                1965          1030      -935
  Lines              160604         85199    -75405
=======================================================
- Hits                75341         44546    -30795
+ Misses              78734         36523    -42211
+ Partials             6529          4130     -2399
Flags with carried forward coverage won't be shown.
// Wrap any lazy parent that has not been materialized yet.
for k, parent := range cms.storeParents {
    if _, exists := cms.stores[k]; !exists {
        cms.stores[k] = handler(k, parent.(types.KVStore))
    }
}

Check warning (Code scanning / CodeQL): Iteration over map
// Clear the lazy parent references.
for k := range cms.storeParents {
    delete(cms.storeParents, k)
}

Check warning (Code scanning / CodeQL): Iteration over map
// Same materialization for the giga stores.
for k, parent := range cms.gigaStoreParents {
    if _, exists := cms.gigaStores[k]; !exists {
        cms.gigaStores[k] = handler(k, parent)
    }
}

Check warning (Code scanning / CodeQL): Iteration over map
// Clear the lazy giga parent references.
for k := range cms.gigaStoreParents {
    delete(cms.gigaStoreParents, k)
}

Check warning (Code scanning / CodeQL): Iteration over map
When CacheMultiStore branches (CacheContext, CacheTxContext), lazy parents were passed directly to the child CMS. This caused child.Write() to write directly to the raw commit store, bypassing the parent's caching layer — leading to state leaks (e.g., Simulate leaking account sequence increments to deliverState).

Fix: force-materialize all lazy parents on the parent CMS before creating the child branch. This preserves proper cachekv isolation while keeping lazy init for the OCC hot path (SetKVStores).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
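A sketch of the fix described in this commit, continuing the earlier sketch (the method body is illustrative; the real method returns types.CacheMultiStore and goes through more plumbing):

```go
// Sketch: branching must not hand lazy parents to the child CMS.
// Materialize everything first so the child wraps this store's cachekv
// layer; otherwise child.Write() would land on the raw commit store and
// leak state past the parent (the Simulate/deliverState bug above).
func (cms *Store) CacheMultiStore() Store { // real signature returns types.CacheMultiStore
	for k := range cms.storeParents {
		cms.ensureStore(k)
	}
	for k := range cms.gigaStoreParents {
		cms.ensureGigaStore(k) // mirrors ensureStore for the giga stores
	}
	return newCacheMultiStoreFromCMS(*cms)
}
```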
The lazy ensureStore/ensureGigaStore methods write to the stores and storeParents maps on first access. When multiple goroutines share a CMS (e.g., slashing BeginBlocker's HandleValidatorSignatureConcurrent), concurrent calls to GetKVStore race on these map operations.

Add a *sync.RWMutex with double-checked locking: the fast path (store already materialized) takes only an RLock; the slow path (first access) takes a full Lock. After all stores are materialized, every call hits the RLock-only fast path with ~10-20ns overhead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
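Continuing the earlier sketch, the unsynchronized ensureStore becomes the double-checked version below (the mtx field name and its initialization are assumptions):

```go
// Assumed field on Store: mtx *sync.RWMutex, set in newCacheMultiStoreFromCMS.

// Fast path: the wrapper already exists, so a read lock suffices.
// Slow path: first access for this key takes the write lock and re-checks.
func (cms *Store) ensureStore(key types.StoreKey) {
	cms.mtx.RLock()
	_, ok := cms.stores[key]
	cms.mtx.RUnlock()
	if ok {
		return // already materialized; only the ~10-20ns RLock was paid
	}

	cms.mtx.Lock()
	defer cms.mtx.Unlock()
	cms.ensureStoreLocked(key)
}

// ensureStoreLocked assumes cms.mtx is held for writing.
func (cms *Store) ensureStoreLocked(key types.StoreKey) {
	if _, ok := cms.stores[key]; ok {
		return // another goroutine materialized it between RUnlock and Lock
	}
	if parent, ok := cms.storeParents[key]; ok {
		cms.stores[key] = cachekv.NewStore(parent) // argument list simplified
	}
}
```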
// Force-materialize every remaining lazy store.
for k := range cms.storeParents {
    cms.ensureStoreLocked(k)
}

Check warning (Code scanning / CodeQL): Iteration over map
// Force-materialize every remaining lazy giga store.
for k := range cms.gigaStoreParents {
    cms.ensureGigaStoreLocked(k)
}

Check warning (Code scanning / CodeQL): Iteration over map
Summary
- Defer cachekv.NewStore allocation in CacheMultiStore until the store is actually accessed via GetKVStore/GetStore/GetGigaKVStore
- Store parent references in storeParents/gigaStoreParents maps; materialize cachekv wrappers on-demand via ensureStore()/ensureGigaStore()
- OCC's prepareTask discards them immediately, and EVM txs only touch ~3 of 21 modules

Problem
After lazy-init sortedCache (#2804), the next largest allocator was cachekv.NewStore itself — still called eagerly for all ~21 module stores per CacheMultiStore, even though EVM transactions only access ~3 modules. OCC's prepareTask creates a CMS then immediately replaces stores with VersionIndexedStores, discarding the cachekv wrappers. Pure waste.

Profiling after #2804 (30s, M4 Max, 1000 EVM transfers/block):
- cachekv.NewStore
- cachekv.Write (re-creates MemDB)
- cachemulti.newCacheMultiStoreFromCMS
- memclrNoHeapPointers
Changes

sei-cosmos/store/cachemulti/store.go:
- newCacheMultiStoreFromCMS: store parent KVStore references in storeParents/gigaStoreParents maps instead of eagerly wrapping with cachekv.NewStore
- ensureStore()/ensureGigaStore(): lazy materializers that create the cachekv wrapper on first access
- GetKVStore/GetStore/GetGigaKVStore: call ensureStore()/ensureGigaStore() before returning
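Continuing the sketch from above, the flush side: only wrappers that were actually materialized have anything to write. Where the lazy parent maps are cleared (the delete loops flagged by CodeQL earlier) is an assumption in this sketch; placing it in Write() is one plausible reading:

```go
// Sketch only: flush the materialized wrappers and drop the lazy references.
// Stores that were never touched were never wrapped, so nothing is written
// for them.
func (cms *Store) Write() {
	for _, store := range cms.stores {
		store.Write()
	}
	for k := range cms.storeParents {
		delete(cms.storeParents, k)
	}
	for k := range cms.gigaStoreParents {
		delete(cms.gigaStoreParents, k)
	}
}
```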
Benchmark Results (M4 Max, 1000 EVM transfers/block, 30s profile)

- cachekv.NewStore alloc: -20,614 MB (-360M objects)
- cachekv.Write alloc: -8,058 MB
- memclrNoHeapPointers CPU: -9.94s
- mallocgc CPU (cum): -7.23s
- runtime.(*mspan).heapBitsSmallForAddr CPU

Note: total CPU samples increase because workers complete faster → more goroutines actively executing instead of blocked on allocator contention.
pprof -top -diff_base highlights (alloc_space)

pprof -top -diff_base highlights (CPU)
Test plan

- (cd giga/tests && go test ./... -count=1)
- (cd sei-cosmos/store/cachemulti && go test ./... -count=1)
- gofmt -s -l clean

🤖 Generated with Claude Code