Skip to content

Make the API async where needed, avoid block_on #926

Draft
tnull wants to merge 35 commits into
lightningdevkit:mainfrom
tnull:2026-06-async-kvstore-persistence-async-api
Draft

Make the API async where needed, avoid block_on #926
tnull wants to merge 35 commits into
lightningdevkit:mainfrom
tnull:2026-06-async-kvstore-persistence-async-api

Conversation

@tnull

@tnull tnull commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

Based on #919.

WIP

tnull added 30 commits June 5, 2026 10:35
Use a temporary rust-lightning fork revision that exposes async
migratable KV-store support. This lets ldk-node migrate filesystem
stores without reimplementing LDK's migration logic locally.

Co-Authored-By: HAL 9000
Read and write BDK wallet state through async KVStore helpers while
keeping the current WalletPersister entry points bridged through the
node runtime. This reduces the wallet persistence surface that still
depends on KVStoreSync.

Co-Authored-By: HAL 9000
Static invoice persistence already runs from async handlers, so use
KVStore directly instead of routing those reads and writes through
the blocking KVStoreSync trait.

Co-Authored-By: HAL 9000
Persist peer store updates through async KVStore operations. The
synchronous node APIs keep bridging at their runtime boundary while
async event handling awaits peer persistence directly.

Co-Authored-By: HAL 9000
Document that synchronous peer readers do not wait for the async mutation guard and can observe changes before persistence catches up.

Co-Authored-By: HAL 9000
Persist DataStore mutations through async KVStore operations while
keeping the existing synchronous APIs bridged through the node
runtime. Async event handling now awaits payment store writes
directly.

Co-Authored-By: HAL 9000
Document that synchronous DataStore readers do not wait for the async mutation guard and can observe transient state while persistence catches up.

Co-Authored-By: HAL 9000
Persist node metric updates through async KVStore writes and await
them from the chain, gossip, and scoring tasks. This removes the
remaining blocking metrics writer while keeping the helper name
stable.

Co-Authored-By: HAL 9000
Avoid introducing a temporary macro when moving node metrics persistence onto async KV storage.

Co-Authored-By: HAL 9000
Document that synchronous node metrics readers do not wait for the async mutation guard and can observe metrics while persistence catches up.

Co-Authored-By: HAL 9000
Persist the on-chain wallet through BDK's AsyncWalletPersister so
wallet state writes use the async KVStore path. Existing synchronous
wallet APIs keep bridging through the node runtime until their
callers are made async.

Co-Authored-By: HAL 9000
Open filesystem stores through the async LDK migration helper so
v1-to-v2 store migration no longer depends on the blocking KVStoreSync
migration path.

Co-Authored-By: HAL 9000
Move the existing in-memory test store into a shared module without
changing its behavior. This lets integration tests reuse it while
keeping the later async TestSyncStore change separate.

Co-Authored-By: HAL 9000
Keep the shared test store move as a pure code move by restoring the original comments and spacing.

Co-Authored-By: HAL 9000
Exercise async KVStore operations in TestSyncStore and filesystem migration tests while keeping the temporary sync comparison path until the final KVStoreSync removal.

Co-Authored-By: HAL 9000
Use a local async KVStore-backed monitor persister in store tests so the later blocking KVStore removal can drop the old synchronous MonitorUpdatingPersister path without mixing the test adapter into that change.

Co-Authored-By: HAL 9000
Drop the remaining synchronous KV store trait bounds and implementations. After the preceding migrations, custom stores only need to provide async KVStore persistence, and pathfinding score export also reads from the async store.

Co-Authored-By: HAL 9000
Compile the VSS persistence tests after the shared KV store helper moved to async persistence.

Co-Authored-By: HAL 9000
Add a crate-local runtime wrapper for store backends that need to keep their I/O isolated while shutting down safely from async contexts.

Co-Authored-By: HAL 9000
Document why store backends temporarily use isolated runtimes while synchronous APIs still bridge async persistence with block_on.

Co-Authored-By: HAL 9000
Keep VSS persistence on the isolated runtime without hiding each spawned task behind an extra helper, making the temporary runtime bridge easier to follow at each call site.

Co-Authored-By: HAL 9000
Keep PostgreSQL persistence on the isolated runtime without hiding each spawned task behind an extra helper, making the temporary runtime bridge easier to follow at each call site.

Co-Authored-By: HAL 9000
Enable Tokio's eager driver handoff when building with tokio_unstable so node-owned runtimes can use the dedicated driver handoff path where available.

Build binding artifacts and selected CI coverage with tokio_unstable so the cfg-gated runtime path remains exercised.

Co-Authored-By: HAL 9000
Document why Tokio eager driver handoff only narrows the block_on deadlock risk and why StoreRuntime remains necessary until fully async APIs remove the blocking bridge.

Co-Authored-By: HAL 9000
Give ldk-node-created runtime workers a stable ldk-node-prefixed thread name so runtime activity is easier to identify in debuggers and logs, matching the store runtime naming convention.

Co-Authored-By: HAL 9000
tnull added 5 commits June 8, 2026 17:06
Avoid nested runtime waits when peer, payment, and node metrics operations persist through async storage. This lets callers already in async contexts await those operations directly while keeping existing sync boundaries explicit.

Co-Authored-By: HAL 9000
Resolve VSS schema setup through async initialization on the caller runtime so VSS stores no longer create or retain a Tokio runtime.

Co-Authored-By: HAL 9000
Construct PostgreSQL storage on the caller runtime now that store-backed APIs are async, removing the store-owned runtime and shutdown path.

Co-Authored-By: HAL 9000
Expose startup, shutdown, builder, wallet, liquidity, and bindings entrypoints as async so store-backed persistence no longer needs local synchronous future driving. Update tests and benches to await the async surface and keep store persistence helpers on the async storage path.

Co-Authored-By: HAL 9000
Describe the Tokio runtime expectations for LDK Node callers so async API users know when runtime handles are reused and why runtime worker threads must remain available for node progress.

Co-Authored-By: HAL 9000
@ldk-reviews-bot

ldk-reviews-bot commented Jun 8, 2026

Copy link
Copy Markdown

👋 Hi! This PR is now in draft status.
I'll wait to assign reviewers until you mark it as ready for review.
Just convert it out of draft status when you're ready for review!

@tnull tnull marked this pull request as draft June 8, 2026 16:45
Comment thread benches/payments.rs
node_b.node_id(),
preimage,
None,
);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

await should be here

Comment thread src/payment/bolt11.rs

let liquidity_source = Arc::clone(&liquidity_source);
let invoice = self.runtime.block_on(async move {
let invoice = async move {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still need this async move?

Comment thread src/wallet/mod.rs Outdated
}
};

if let Err(e) = self.runtime.block_on(self.update_payment_store(events)) {

@benthecarman benthecarman Jun 8, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need an async version of these traits in LDK?

Comment thread src/wallet/mod.rs
let wallet = self.clone();
let runtime = Arc::clone(&self.runtime);
let logger = Arc::clone(&self.logger);
runtime.spawn_background_task(async move {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems dangerous to not return an error if persists fails. Idk if we can really get around it though

Comment thread src/lib.rs
let logger = Arc::clone(&listening_logger);
let listening_addrs = listening_addresses.clone();
let listeners = self.runtime.block_on(async move {
let listeners = async move {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont really need this async move

Comment thread src/lib.rs
return Err(Error::NotRunning);
pub async fn stop(&self) -> Result<(), Error> {
{
let mut is_running_lock = self.is_running.write().expect("lock");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we still be using a sync lock for is_running?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants