Conversation
- Change pruneSnapshots() to run asynchronously (go db.pruneSnapshots()) - Replace pruneSnapshotLock mutex with atomic.Bool using CompareAndSwap - Add logging for prune start/completion with duration - Update Close() to wait for ongoing prune before shutdown - Update test to use Eventually with pruningInProgress.Load()
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 353c42d82f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
sei-db/state_db/sc/memiavl/db.go
Outdated
| // CAS: only one prune can run at a time | ||
| if !db.pruningInProgress.CompareAndSwap(false, true) { | ||
| db.logger.Info("pruneSnapshots skipped, previous prune still in progress") | ||
| return |
There was a problem hiding this comment.
Ensure prune requests aren’t dropped
The new CAS guard returns early when a prune is already running. That means a snapshot rewrite that finishes while a prune is in progress will log “skipped” and never schedule a follow‑up prune. Because pruneSnapshots() bases its retention on currentVersion at start, any snapshots created after it begins won’t be considered, leaving more than snapshotKeepRecent snapshots on disk until the next rewrite happens. If no further rewrite occurs, stale snapshots are never pruned (regression versus the previous mutex, which queued prunes). Consider queuing a rerun (e.g., loop until no more pending work) or scheduling another prune after the current one completes.
Useful? React with 👍 / 👎.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2814 +/- ##
==========================================
+ Coverage 46.95% 56.61% +9.65%
==========================================
Files 1967 2033 +66
Lines 160924 166263 +5339
==========================================
+ Hits 75567 94131 +18564
+ Misses 78790 63864 -14926
- Partials 6567 8268 +1701
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
sei-db/state_db/sc/memiavl/db.go
Outdated
| db.pruneSnapshotLock.Lock() | ||
| defer db.pruneSnapshotLock.Unlock() | ||
| // CAS: only one prune can run at a time | ||
| if !db.pruningInProgress.CompareAndSwap(false, true) { |
There was a problem hiding this comment.
this works, but fwiw, there's also a TryLock option (fyi):
if !db.pruningMu.TryLock() {
db.logger.Info("pruneSnapshots skipped, previous prune still in progress")
return
}
defer db.pruningMu.Unlock()
There was a problem hiding this comment.
this seems even simpler than cas way
sei-db/state_db/sc/memiavl/db.go
Outdated
| db.logger.Info("pruneSnapshots started") | ||
| startTime := time.Now() | ||
| defer func() { | ||
| db.logger.Info("pruneSnapshots completed", "duration", fmt.Sprintf("%.2fs", time.Since(startTime).Seconds())) |
sei-db/state_db/sc/memiavl/db.go
Outdated
| } | ||
| db.closed = true | ||
| // Wait for any ongoing prune to finish, then block new prunes | ||
| for !db.pruningInProgress.CompareAndSwap(false, true) { |
There was a problem hiding this comment.
and if we used a mutex, this could be
db.pruningMu.Lock() // blocks until pruning finishes, no spinning needed
db.pruningMu.Unlock()
There was a problem hiding this comment.
will take this approach
stevenlanders
left a comment
There was a problem hiding this comment.
added some comments but lgtm
|
Created backport PR for
Please cherry-pick the changes locally and resolve any conflicts. git fetch origin backport-2814-to-release/v6.3
git worktree add --checkout .worktree/backport-2814-to-release/v6.3 backport-2814-to-release/v6.3
cd .worktree/backport-2814-to-release/v6.3
git reset --hard HEAD^
git cherry-pick -x 5c26262c6e31fb0f25a8f26a59de8cbad2ca43d4
git push --force-with-lease |
…2814) - Run pruneSnapshots() asynchronously (go db.pruneSnapshots()) to avoid blocking Commit - Use TryLock instead of Lock so overlapping prunes are skipped rather than queued - Add logging for prune start/skip/completion with duration - Add idempotent Close() guard to prevent double-close - Update test to use Eventually on directory state (cherry picked from commit 5c26262)
…2814) - Run pruneSnapshots() asynchronously (go db.pruneSnapshots()) to avoid blocking Commit - Use TryLock instead of Lock so overlapping prunes are skipped rather than queued - Add logging for prune start/skip/completion with duration - Add idempotent Close() guard to prevent double-close - Update test to use Eventually on directory state (cherry picked from commit 5c26262)
Backport of #2814 to `release/v6.3`. Co-authored-by: yirenz <blindchaser@users.noreply.github.com>
…2814) - Run pruneSnapshots() asynchronously (go db.pruneSnapshots()) to avoid blocking Commit - Use TryLock instead of Lock so overlapping prunes are skipped rather than queued - Add logging for prune start/skip/completion with duration - Add idempotent Close() guard to prevent double-close - Update test to use Eventually on directory state ## Describe your changes and provide context ## Testing performed to validate your change
Describe your changes and provide context
Testing performed to validate your change