[persist] Shift all heartbeat / frontier management for the read handle to a background task #34590
Conversation
Force-pushed from 5e66761 to 080ec06.
This is somewhere between Tokio's `watch` and a normal mutex with a condvar. Kind of annoying to have to write one's own library for this, but everything else I tried was either awkward or needed an unbounded channel...
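For illustration, here is a minimal sketch of that kind of primitive, assuming a Tokio runtime. The `StateCell` name and its methods are invented here (chosen to echo the `modify` call in the diff below), not the PR's actual API: writers mutate shared state under a mutex and nudge a `tokio::sync::Notify`, and the waiter always reads the latest value.

```rust
use std::sync::Mutex;
use tokio::sync::Notify;

/// A watch-like cell: writers mutate shared state in place and wake any
/// waiter; the waiter always observes the latest value. (A hypothetical
/// sketch of the idea, not the actual type added in this PR.)
struct StateCell<T> {
    state: Mutex<T>,
    notify: Notify,
}

impl<T> StateCell<T> {
    fn new(init: T) -> Self {
        StateCell { state: Mutex::new(init), notify: Notify::new() }
    }

    /// Mutate the state in place and wake the waiter. Unlike a channel,
    /// updates coalesce: intermediate values may never be observed.
    fn modify(&self, f: impl FnOnce(&mut T)) {
        f(&mut self.state.lock().unwrap());
        self.notify.notify_one();
    }

    /// Wait for a modification, then read the current state.
    async fn wait_and_read<R>(&self, read: impl FnOnce(&T) -> R) -> R {
        self.notify.notified().await;
        read(&self.state.lock().unwrap())
    }
}
```

Compared to `watch`, the value is mutated in place rather than replaced; compared to an unbounded channel, there is no queue to grow when the reader falls behind.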
These two are much more tightly related than the task is to the read handle, and this allows for some additional privacy for shared state.
- Observe new seqnos in both the handle and the heartbeat thread
- Always downgrade to the shared seqno (should be pretty recent!)
- Use the main downgrade-since method in the background thread
- Do all downgrades in the background task
- Shift expiry to the background task
- Move the heartbeat task to read.rs (this is a bit more read-handle business than machine business, in my opinion!)
- Comment fixup
Deployed to staging. Overall metrics look good - the rate of downgrades is less spiky and they see fewer conflicts. One negative metric change is that the heartbeat task wakes up substantially more often. I believe this is because frequent downgrading of the since will trigger the background task to wake, even if there's nothing to do yet. I think I could rework the code to avoid this, though it's a bit tedious, and it's only a mild drawback, so I will wait to be asked first. :)
```rust
self.since = new_since.clone();
self.hold_state.modify(|s| {
    s.downgrade_since(new_since);
    s.request_sync = true;
});
```
You could probably peek the outstanding seqno here to decide if you wanted to request a wakeup... but I'm not too worried
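Sketching that suggestion - with `outstanding_seqno()` and `held_seqno` as hypothetical names for whatever the shared state tracks - the unconditional flag above could become conditional:

```rust
self.hold_state.modify(|s| {
    s.downgrade_since(new_since);
    // Only request an early wakeup if the background task actually has
    // a newer seqno it could release; otherwise let it sleep until its
    // next scheduled heartbeat. (Hypothetical field/method names.)
    s.request_sync = s.outstanding_seqno() > s.held_seqno;
});
```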
DAlperin left a comment:
Very exciting, thank you!
Originally, Persist was built with the assumption that callers would reliably call the `ReadHandle` methods (like `maybe_downgrade_since`) with some reasonable frequency - say, at least every few minutes. This assumption turns out not to be true for a variety of reasons... and hasn't even been true for Persist's own internal use of `ReadHandle`! (For example, calls to `snapshot` can block indefinitely, so you'd need to add timeouts to those calls and call them in a loop with a downgrade operation, like we do in this recent PR. Not a nice pattern!) We shifted heartbeating to a background thread, which helped quite a bit with lease timeouts, but didn't entirely solve the issue - failing to downgrade regularly could cause unnecessary seqno holds, for example.

This PR shifts all interaction with the reader lease to the heartbeat task, which is now responsible for downgrading both the since and seqno holds, and manages the final expiry of the state. Communication between the handle and the task is via some shared state, which is managed with an abstraction similar to Persist's `Applier` for shard-level state. This ensures that the state is updated regularly, and with the most recent possible information, even when whoever holds the handle is busy with other stuff.
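As a rough sketch of that handle/task split - assuming a Tokio runtime, with `HoldState` and its fields invented here to echo the names in the diff shown earlier (`hold_state`, `request_sync`) rather than copied from the PR - the handle only writes its latest intent into shared state, and the background task wakes on a timer or an explicit nudge and applies it:

```rust
use std::sync::{Arc, Mutex};
use std::time::Duration;
use tokio::sync::Notify;

// Invented for illustration: what the handle shares with its task.
#[derive(Default)]
struct HoldState {
    since: u64,         // stand-in for the since frontier
    seqno: u64,         // latest seqno the handle has observed
    request_sync: bool, // handle asks the task to act promptly
    expired: bool,      // handle dropped; release the lease
}

async fn heartbeat_task(shared: Arc<(Mutex<HoldState>, Notify)>) {
    let (state, notify) = &*shared;
    loop {
        // Wake on the regular heartbeat interval, or early if the
        // handle explicitly requested a sync.
        tokio::select! {
            _ = tokio::time::sleep(Duration::from_secs(60)) => {}
            _ = notify.notified() => {}
        }
        let (since, seqno, expired) = {
            let mut s = state.lock().unwrap();
            s.request_sync = false;
            (s.since, s.seqno, s.expired)
        };
        // ... heartbeat the lease and downgrade the since / seqno
        // holds with the freshest values observed above ...
        let _ = (since, seqno);
        if expired {
            return; // final expiry happens here, not in the handle
        }
    }
}
```

The point of this shape is that every interaction with the durable lease happens on the task's schedule, so a caller that blocks in `snapshot` for an hour no longer blocks downgrades.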
Motivation
https://github.com/MaterializeInc/incidents-and-escalations/issues/224
A core issue with this stuff is that we can't downgrade our seqno hold when nobody is interacting with the client (downgrading frontiers, etc.), and in practice it's difficult to ensure that every handle is interacted with promptly. The background heartbeat task was a partial solution: it extended leases, but did not release any unnecessarily held resources. Automatically downgrading the seqno hold, without requiring the caller to do anything proactively, should help avoid most cases of leaked seqnos.
This is a more general solution than the recent #34458, which times out certain long-running operations so we can run downgrades etc. With this PR, the downgrades run in the background heartbeat task no matter what long-running operation prevents progress in the frontend.
Tips for reviewer
After this PR, the only way to have a long-running seqno hold should be if a caller has leased a part and not returned it. I'm optimistic that that case is rare - it involves more obviously-wrong client behaviour, whereas the case this PR fixes is pretty subtle - and anyway there's not much we could do about that in Persist internals, since holding back the seqno is needed for correctness.