feat: add ledger cache layer for receipt store #2788
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2788 +/- ##
==========================================
- Coverage 57.12% 57.09% -0.03%
==========================================
Files 2088 2090 +2
Lines 171079 171169 +90
==========================================
+ Hits 97726 97734 +8
- Misses 64680 64735 +55
- Partials 8673 8700 +27
}

type cacheWarmupProvider interface {
    warmupReceipts() []ReceiptRecord
Does warm up mean we will load some receipts into the cache during initialization?
Yes, the parquet backend uses it to replay from the WAL.
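To make the warm-up exchange above concrete, here is a minimal sketch of how an optional interface like `cacheWarmupProvider` could seed a cache at construction time. Only the interface name and the `warmupReceipts() []ReceiptRecord` signature come from the diff; the `ReceiptRecord` fields, `walBackend`, `plainBackend`, `cache`, and `newCache` are hypothetical names made up for illustration, and the real wiring in the PR may differ.

```go
package main

import "fmt"

// ReceiptRecord is a hypothetical, simplified record; the real type is not shown in the diff.
type ReceiptRecord struct {
	TxHash      string
	BlockNumber uint64
}

// cacheWarmupProvider matches the interface from the diff.
type cacheWarmupProvider interface {
	warmupReceipts() []ReceiptRecord
}

// plainBackend (hypothetical) has nothing to replay and does not implement the interface.
type plainBackend struct{}

// walBackend (hypothetical) stands in for a backend that recovered receipts from its WAL.
type walBackend struct {
	pending []ReceiptRecord
}

func (b walBackend) warmupReceipts() []ReceiptRecord { return b.pending }

// cache is a hypothetical in-memory index keyed by transaction hash.
type cache struct {
	byHash map[string]ReceiptRecord
}

// newCache seeds the cache only when the backend opts in by implementing cacheWarmupProvider.
func newCache(backend interface{}) *cache {
	c := &cache{byHash: make(map[string]ReceiptRecord)}
	if p, ok := backend.(cacheWarmupProvider); ok {
		for _, rec := range p.warmupReceipts() {
			c.byHash[rec.TxHash] = rec
		}
	}
	return c
}

func main() {
	warm := newCache(walBackend{pending: []ReceiptRecord{{TxHash: "0xabc", BlockNumber: 42}}})
	cold := newCache(plainBackend{})
	fmt.Println(len(warm.byHash), len(cold.byHash))
}
```

The type assertion keeps warm-up optional, so backends with no WAL need no extra method.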
    s.cacheNextRotate = blockNumber + s.cacheRotateInterval
    return
}
for blockNumber >= s.cacheNextRotate {
Why not use blockNumber % interval == 0?
We could miss the rotation if the block where blockNumber % interval == 0 does not have EVM receipts for some reason.
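The difference between the two triggers can be sketched as follows. This is an illustration, not the PR's code: `rotator`, `observeThreshold`, `observeModulo`, and `countRotations` are hypothetical names, and treating a zero `nextRotate` as "not yet initialized" is an assumption, since the diff does not show the condition guarding the initial assignment.

```go
package main

import "fmt"

// rotator is a hypothetical stand-in for the cache's rotation state.
type rotator struct {
	interval   uint64 // blocks per chunk
	nextRotate uint64 // first height that triggers the next rotation; 0 means unset (assumption)
	rotations  int    // number of rotations performed so far
}

// observeThreshold follows the loop from the diff: rotate whenever the observed
// height has reached or passed the next threshold, even if heights were skipped.
func (r *rotator) observeThreshold(blockNumber uint64) {
	if r.nextRotate == 0 {
		r.nextRotate = blockNumber + r.interval
		return
	}
	for blockNumber >= r.nextRotate {
		r.rotations++
		r.nextRotate += r.interval
	}
}

// observeModulo is the alternative raised in review: rotate only when the
// observed height is an exact multiple of the interval.
func (r *rotator) observeModulo(blockNumber uint64) {
	if blockNumber%r.interval == 0 {
		r.rotations++
	}
}

// countRotations feeds the heights that actually carried receipts to one strategy.
func countRotations(interval uint64, heights []uint64, useModulo bool) int {
	r := &rotator{interval: interval}
	for _, h := range heights {
		if useModulo {
			r.observeModulo(h)
		} else {
			r.observeThreshold(h)
		}
	}
	return r.rotations
}

func main() {
	// Heights 500 and 1000 are absent, as if those blocks had no EVM receipts.
	heights := []uint64{1, 250, 499, 501, 750, 999, 1001}
	fmt.Println("threshold rotations:", countRotations(500, heights, false))
	fmt.Println("modulo rotations:", countRotations(500, heights, true))
}
```

With these heights the threshold loop rotates twice, while the modulo check never fires because no observed height is a multiple of 500.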
receipt, found := blockReceipts[txHash]
if found {
    // Callers (e.g. RPC response formatting) may normalize TransactionIndex in-place.
    // Clone to avoid mutating the cached receipt and corrupting future lookups.
For performance reasons, do we want to default to zero copy? As long as we make sure the codebase doesn't modify the receipt after calling get, we should be able to enable zeroCopy.
I think there are other places where we do modify the receipt. For example, there are multiple mutations in evmrpc/tx.go:
Lines 154-165 (for failed txs that used 0 gas):
receipt.From = from.Hex()
receipt.To = etx.To().Hex()
receipt.ContractAddress = ""
receipt.TxType = uint32(etx.Type())
receipt.Status = uint32(ethtypes.ReceiptStatusFailed)
receipt.GasUsed = 0
Line 456 (tx index normalization):
receipt.TransactionIndex = uint32(evmTxIndex)
Hmm, that's bad. Are these the only places where we modify receipts? Can we fix the places that modify receipts directly so that they clone the receipt before modifying it?
Fixed. Moved the receipt cloning out of ledger_db and into the modification locations.
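The resolution above (zero-copy reads from the cache, with the clone made at each mutation site) might look roughly like this. The `Receipt` and `Log` types here are simplified, hypothetical stand-ins rather than the real receipt type, and `cloneReceipt` and `normalizeIndex` are illustrative names; the actual type may need a different cloning mechanism.

```go
package main

import "fmt"

// Log and Receipt are simplified, hypothetical stand-ins for the real types.
type Log struct {
	Index uint32
}

type Receipt struct {
	TransactionIndex uint32
	GasUsed          uint64
	Logs             []Log
}

// cloneReceipt copies the struct and its Logs slice so that the copy shares
// no mutable state with the cached original.
func cloneReceipt(r *Receipt) *Receipt {
	if r == nil {
		return nil
	}
	c := *r
	c.Logs = append([]Log(nil), r.Logs...)
	return &c
}

// normalizeIndex is a mutation site: it clones first, then modifies the copy,
// leaving the receipt held by the cache untouched.
func normalizeIndex(cached *Receipt, evmTxIndex int) *Receipt {
	out := cloneReceipt(cached)
	out.TransactionIndex = uint32(evmTxIndex)
	return out
}

func main() {
	cached := &Receipt{TransactionIndex: 7, Logs: []Log{{Index: 1}}}
	out := normalizeIndex(cached, 2)
	fmt.Println(cached.TransactionIndex, out.TransactionIndex)
}
```

Cloning only where a mutation happens keeps the common read path allocation-free, at the cost of requiring every future mutation site to follow the same rule.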
## Summary

This PR adds a parquet-based receipt storage backend with DuckDB for efficient range queries on logs, enabling fast `eth_getLogs` queries across block ranges.

- Add parquet backend option (`Backend: "parquet"` in config)
- Parquet files rotate every 500 blocks
- DuckDB queries across closed parquet files for efficient log filtering
- WAL for crash recovery of in-progress parquet files
- Pruning of old parquet files based on `KeepRecent` config
- Build tag support: use `-tags duckdb` to enable parquet backend

The parquet backend supports the new `FilterLogs` range query API introduced in #2788, enabling efficient cross-block log queries without falling back to per-receipt fetching.

## Dependencies

- Depends on #2788 (ledger cache layer)

## Test plan

- Receipt store unit tests pass (without duckdb tag)
- Parquet store tests pass with `-tags duckdb`
- Integration testing with full node using parquet backend
## Summary

This PR introduces an in-memory cache layer on top of the pebbledb receipt store backend, improving `GetReceipt` performance for recently accessed receipts.

- Add `cachedReceiptStore` wrapper that caches receipts in rotating chunks
- Simplify `FilterLogs` API: change from per-block signature to range-based `(fromBlock, toBlock, crit)`
- Pebble backend's `FilterLogs` returns `ErrRangeQueryNotSupported`, signaling callers to fetch receipts individually
- Update `evmrpc/filter.go` with fallback logic for backends that don't support range queries
- Simplify `ReceiptStoreConfig` by removing unused pebble options

The cache rotates every 500 blocks (configurable) and keeps 3 chunks, providing a sliding window of recent receipts for fast lookups.

## Test plan

- [x] Receipt store unit tests pass
- [x] Cached receipt store tests verify cache hits
- [x] FilterLogs returns `ErrRangeQueryNotSupported` for pebble backend
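The fallback described in the summary can be sketched like this. Only `FilterLogs`, `ErrRangeQueryNotSupported`, and the `(fromBlock, toBlock, crit)` shape come from the PR text; the error message, the single-topic `crit` string, `LogsForBlock`, `pebbleLike`, and `getLogs` are hypothetical simplifications, not the code in `evmrpc/filter.go`.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrRangeQueryNotSupported mirrors the sentinel named in the PR; the message is an assumption.
var ErrRangeQueryNotSupported = errors.New("range query not supported")

// Log is a hypothetical, simplified log entry.
type Log struct {
	BlockNumber uint64
	Topic       string
}

// ReceiptStore is a hypothetical, reduced view of the store. The real crit is a
// filter criteria value; a single topic string stands in for it here.
type ReceiptStore interface {
	FilterLogs(fromBlock, toBlock uint64, crit string) ([]Log, error)
	LogsForBlock(block uint64) []Log
}

// pebbleLike (hypothetical) cannot answer range queries and says so with the sentinel.
type pebbleLike struct {
	blocks map[uint64][]Log
}

func (p pebbleLike) FilterLogs(fromBlock, toBlock uint64, crit string) ([]Log, error) {
	return nil, ErrRangeQueryNotSupported
}

func (p pebbleLike) LogsForBlock(block uint64) []Log { return p.blocks[block] }

// getLogs tries the range query first and falls back to per-block fetching
// only when the backend reports that range queries are unsupported.
func getLogs(store ReceiptStore, fromBlock, toBlock uint64, crit string) ([]Log, error) {
	logs, err := store.FilterLogs(fromBlock, toBlock, crit)
	if err == nil {
		return logs, nil
	}
	if !errors.Is(err, ErrRangeQueryNotSupported) {
		return nil, err // a real failure, not a capability signal
	}
	var out []Log
	for b := fromBlock; b <= toBlock; b++ {
		for _, l := range store.LogsForBlock(b) {
			if l.Topic == crit {
				out = append(out, l)
			}
		}
	}
	return out, nil
}

func main() {
	store := pebbleLike{blocks: map[uint64][]Log{
		10: {{BlockNumber: 10, Topic: "A"}, {BlockNumber: 10, Topic: "B"}},
		11: {{BlockNumber: 11, Topic: "A"}},
	}}
	logs, err := getLogs(store, 10, 12, "A")
	fmt.Println(len(logs), err)
}
```

Checking with `errors.Is` lets a wrapping layer add context to the error without breaking the fallback.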