
perf(store): lazy CacheMultiStore — defer cachekv alloc until access#2806

Draft
pdrobnjak wants to merge 3 commits into perf/lazy-init-sorted-cache from perf/lazy-cachemultistore

Conversation


@pdrobnjak pdrobnjak commented Feb 5, 2026

Summary

  • Defer cachekv.NewStore allocation in CacheMultiStore until the store is actually accessed via GetKVStore/GetStore/GetGigaKVStore
  • Store parent references lazily in storeParents/gigaStoreParents maps; materialize cachekv wrappers on-demand via ensureStore()/ensureGigaStore()
  • Eliminates ~85% of cachekv allocations per block: OCC's prepareTask discards them immediately, and EVM txs only touch ~3 of 21 modules

Problem

After lazy-init sortedCache (#2804), the next largest allocator was cachekv.NewStore itself — still called eagerly for all ~21 module stores per CacheMultiStore, even though EVM transactions only access ~3 modules. OCC's prepareTask creates a CMS then immediately replaces stores with VersionIndexedStores, discarding the cachekv wrappers. Pure waste.

Profiling after #2804 (30s, M4 Max, 1000 EVM transfers/block):

| Allocator | alloc_space | Notes |
| --- | --- | --- |
| cachekv.NewStore | 27,330 MB | top allocator |
| cachekv.Write (re-creates MemDB) | 12,438 MB | |
| cachemulti.newCacheMultiStoreFromCMS | 43,516 MB (cum) | |
| memclrNoHeapPointers | | #1 self CPU (16.90s) |

Changes

sei-cosmos/store/cachemulti/store.go:

  1. newCacheMultiStoreFromCMS: Store parent KVStore references in storeParents / gigaStoreParents maps instead of eagerly wrapping with cachekv.NewStore
  2. ensureStore() / ensureGigaStore(): Lazy materializers that create the cachekv wrapper on first access (see the sketch after this list)
  3. GetKVStore / GetStore / GetGigaKVStore: Call ensureStore() / ensureGigaStore() before returning
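
A minimal sketch of the lazy wiring described above, using simplified stand-in types (KVStore, newCacheKV, and string keys are illustrative assumptions, not the actual sei-cosmos interfaces or the cachekv.NewStore signature):

```go
package cachemulti

// KVStore and newCacheKV are simplified stand-ins for the real
// types.KVStore interface and the cachekv.NewStore constructor.
type KVStore interface {
	Get(key []byte) []byte
}

type cacheKV struct{ parent KVStore }

func (c *cacheKV) Get(key []byte) []byte { return c.parent.Get(key) }

func newCacheKV(parent KVStore) KVStore { return &cacheKV{parent: parent} }

// Store records raw parents at construction and materializes cachekv
// wrappers only when a store is actually requested.
type Store struct {
	stores       map[string]KVStore // materialized wrappers
	storeParents map[string]KVStore // raw parents, not yet wrapped
}

func newLazyCMS(parents map[string]KVStore) *Store {
	return &Store{
		stores:       make(map[string]KVStore, len(parents)),
		storeParents: parents, // note: no wrapper allocation here
	}
}

// ensureStore creates the wrapper for key on first access.
func (cms *Store) ensureStore(key string) {
	if _, ok := cms.stores[key]; ok {
		return // already materialized
	}
	if parent, ok := cms.storeParents[key]; ok {
		cms.stores[key] = newCacheKV(parent)
	}
}

// GetKVStore materializes lazily, then serves from the stores map.
func (cms *Store) GetKVStore(key string) KVStore {
	cms.ensureStore(key)
	return cms.stores[key]
}
```

On the OCC hot path, SetKVStores can then overwrite entries in stores directly and skip the parents entirely, so the discarded wrappers described above are never allocated in the first place.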

Benchmark Results (M4 Max, 1000 EVM transfers/block, 30s profile)

| Metric | Before (after #2804) | After | Delta |
| --- | --- | --- | --- |
| TPS (steady-state range) | 8,000–8,400 | 8,000–8,800 | +~5% |
| cachekv.NewStore alloc | 27,330 MB | 6,716 MB | -20,614 MB |
| cachekv.Write alloc | 12,438 MB | 4,380 MB | -8,058 MB |
| memclrNoHeapPointers CPU | 16.90s (#1 self) | 6.96s | -9.94s |
| mallocgc CPU (cum) | | | -7.23s |
| runtime.(*mspan).heapBitsSmallForAddr CPU | 4.38s | 2.10s | -2.28s |
| Total CPU samples (30s) | 98.96s | 132.73s | +34% (more useful work) |

Note: total CPU samples increase because workers complete faster → more goroutines actively executing instead of blocked on allocator contention.

pprof -top -diff_base highlights (alloc_space)

+33,659 MB  cachemulti.newStoreWithoutGiga  (lazy path replaces eager path)
+26,591 MB  btree.NewFreeListG              (shifted to lazy-init from #2804)
-20,614 MB  cachekv.NewStore                (direct savings from this PR)
+13,602 MB  gaskv.NewStore                  (still allocated eagerly — addressed in #2808)

pprof -top -diff_base highlights (CPU)

+10.26s  runtime.usleep              (workers idle faster — less contention)
 -9.94s  runtime.memclrNoHeapPointers (fewer large allocs to zero)
 +6.14s  syscall.syscall             (more EVM execution reaching CGO)
 -7.23s  runtime.mallocgc cum        (less allocation overall)
 -2.28s  runtime.heapBitsSmallForAddr (less heap bookkeeping)

Test plan

  • All 14 giga tests pass (cd giga/tests && go test ./... -count=1)
  • All 3 cachemulti tests pass (cd sei-cosmos/store/cachemulti && go test ./... -count=1)
  • gofmt -s -l clean
  • pprof diff confirms allocation reduction
  • CI passes

🤖 Generated with Claude Code

perf(store): lazy CacheMultiStore — defer cachekv alloc until access

CacheMultiStore previously created cachekv.NewStore wrappers eagerly for
all ~21 module stores on every snapshot. This was wasteful because:

1. The OCC scheduler's prepareTask creates a CMS then immediately replaces
   all stores with VersionIndexedStores via SetKVStores — the cachekv
   stores allocated in CMS were discarded unused.
2. EVM transactions only access ~3 of 21 modules, so ~85% of cachekv
   stores per statedb snapshot were never touched.

This change defers cachekv.NewStore allocation by storing parent references
in storeParents/gigaStoreParents maps and materializing cachekv wrappers
lazily on first GetKVStore/GetStore/GetGigaKVStore access.

Benchmark results (M4 Max, 1000 EVM transfers/block):
- TPS: 8,000-8,400 → 8,000-8,800 (~5% median uplift)
- cachekv.NewStore allocations: -20,614 MB (-360M objects)
- cachekv.Write allocations: -8,058 MB
- memclrNoHeapPointers CPU: -9.94s
- mallocgc CPU: -7.23s cumulative

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

github-actions bot commented Feb 5, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | Feb 6, 2026, 5:20 PM |


codecov bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 49.42529% with 44 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.28%. Comparing base (fd2e28d) to head (d52d4a1).
⚠️ Report is 5 commits behind head on perf/lazy-init-sorted-cache.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| sei-cosmos/store/cachemulti/store.go | 49.42% | 42 Missing and 2 partials ⚠️ |
Additional details and impacted files


@@                       Coverage Diff                       @@
##           perf/lazy-init-sorted-cache    #2806      +/-   ##
===============================================================
+ Coverage                        46.91%   52.28%   +5.37%     
===============================================================
  Files                             1965     1030     -935     
  Lines                           160604    85199   -75405     
===============================================================
- Hits                             75341    44546   -30795     
+ Misses                           78734    36523   -42211     
+ Partials                          6529     4130    -2399     
| Flag | Coverage Δ |
| --- | --- |
| sei-chain | ? |
| sei-cosmos | 48.13% <49.42%> (-0.01%) ⬇️ |
| sei-db | 68.72% <ø> (ø) |
| sei-tendermint | 58.08% <ø> (?) |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| sei-cosmos/store/cachemulti/store.go | 48.33% <49.42%> (-22.15%) ⬇️ |

... and 1444 files with indirect coverage changes


Comment on lines +278 to +282
```go
for k, parent := range cms.storeParents {
	if _, exists := cms.stores[k]; !exists {
		cms.stores[k] = handler(k, parent.(types.KVStore))
	}
}
```

Check warning (Code scanning / CodeQL): Iteration over map may be a possible source of non-determinism.
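
In this path the loop only materializes each store once and the result does not depend on visit order, so the warning is likely benign here; where deterministic order does matter, the usual remedy is to iterate over sorted keys. A minimal sketch, assuming string keys for simplicity (the real maps are keyed by types.StoreKey):

```go
import "sort"

// deterministicRange visits a map in ascending key order, giving a
// stable iteration order regardless of Go's randomized map ranging.
func deterministicRange[V any](m map[string]V, fn func(k string, v V)) {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fn(k, m[k])
	}
}
```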
Comment on lines +284 to +286
```go
for k := range cms.storeParents {
	delete(cms.storeParents, k)
}
```

Check warning (Code scanning / CodeQL): Iteration over map may be a possible source of non-determinism.
Comment on lines +296 to +300
```go
for k, parent := range cms.gigaStoreParents {
	if _, exists := cms.gigaStores[k]; !exists {
		cms.gigaStores[k] = handler(k, parent)
	}
}
```

Check warning (Code scanning / CodeQL): Iteration over map may be a possible source of non-determinism.
Comment on lines +302 to +304
```go
for k := range cms.gigaStoreParents {
	delete(cms.gigaStoreParents, k)
}
```

Check warning (Code scanning / CodeQL): Iteration over map may be a possible source of non-determinism.
When CacheMultiStore branches (CacheContext, CacheTxContext), lazy
parents were passed directly to the child CMS. This caused
child.Write() to write directly to the raw commit store, bypassing
the parent's caching layer — leading to state leaks (e.g., Simulate
leaking account sequence increments to deliverState).

Fix: force-materialize all lazy parents on the parent CMS before
creating the child branch. This preserves proper cachekv isolation
while keeping lazy init for the OCC hot path (SetKVStores).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
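
A sketch of the branching fix described in this commit, reusing the simplified Store/newLazyCMS stand-ins from the earlier sketch (the CacheMultiStore method name mirrors the real branching entry point, but the body is illustrative):

```go
// CacheMultiStore branches this CMS. Lazy parents are force-materialized
// first so the child's writes land in this CMS's cachekv layer instead
// of bypassing it and hitting the raw commit store.
func (cms *Store) CacheMultiStore() *Store {
	for k := range cms.storeParents {
		cms.ensureStore(k)
	}
	// The child's parents are this store's materialized wrappers, which
	// preserves cachekv isolation across the branch.
	parents := make(map[string]KVStore, len(cms.stores))
	for k, s := range cms.stores {
		parents[k] = s
	}
	return newLazyCMS(parents)
}
```

This keeps the lazy path intact for OCC, which goes through SetKVStores rather than branching this way, while restoring the isolation that Simulate relies on.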
@pdrobnjak pdrobnjak marked this pull request as draft February 6, 2026 15:28
The lazy ensureStore/ensureGigaStore methods write to the stores and
storeParents maps on first access. When multiple goroutines share a
CMS (e.g., slashing BeginBlocker's HandleValidatorSignatureConcurrent),
concurrent calls to GetKVStore race on these map operations.

Add *sync.RWMutex with double-checked locking: the fast path (store
already materialized) takes only an RLock; the slow path (first access)
takes a full Lock. After all stores are materialized, every call hits
the RLock-only fast path with ~10-20ns overhead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
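
A sketch of the double-checked locking described above, again on the simplified stand-ins (sync.RWMutex is the real primitive; the field and method shapes are illustrative):

```go
import "sync"

// SafeStore guards the lazy maps: the fast path (store already
// materialized) takes only a read lock; the slow path takes the full
// lock and re-checks before writing.
type SafeStore struct {
	mtx          sync.RWMutex
	stores       map[string]KVStore
	storeParents map[string]KVStore
}

func (cms *SafeStore) GetKVStore(key string) KVStore {
	cms.mtx.RLock()
	s, ok := cms.stores[key]
	cms.mtx.RUnlock()
	if ok {
		return s // fast path: RLock only
	}

	cms.mtx.Lock()
	defer cms.mtx.Unlock()
	if s, ok := cms.stores[key]; ok {
		return s // another goroutine materialized it while we waited
	}
	s = newCacheKV(cms.storeParents[key])
	cms.stores[key] = s
	return s
}
```

Once every store a workload touches has been materialized, all subsequent calls stay on the RLock-only fast path, which matches the ~10-20ns overhead noted above.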
Comment on lines +103 to +105
```go
for k := range cms.storeParents {
	cms.ensureStoreLocked(k)
}
```

Check warning (Code scanning / CodeQL): Iteration over map may be a possible source of non-determinism.
Comment on lines +106 to +108
```go
for k := range cms.gigaStoreParents {
	cms.ensureGigaStoreLocked(k)
}
```

Check warning (Code scanning / CodeQL): Iteration over map may be a possible source of non-determinism.
