
Bwatch#9069

Draft
sangbida wants to merge 51 commits into ElementsProject:master from sangbida:async-block-processing

Conversation

@sangbida
Collaborator

Important

26.04 FREEZE March 11th: Non-bugfix PRs not ready by this date will wait for 26.06.

RC1 is scheduled on March 23rd

The final release is scheduled for April 15th.

Checklist

Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked:

  • The changelog has been updated in the relevant commit(s) according to the guidelines.
  • Tests have been added or modified to reflect the changes.
  • Documentation has been reviewed and updated as needed.
  • Related issues have been listed and linked, including any that this PR closes.
  • Important: All PRs must consider how to reverse any persistent changes for tools/lightning-downgrade

rustyrussell and others added 30 commits April 19, 2026 17:31
Like bitcoin_txid, they are special backwards-printed snowflakes.

Thanks Obama!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These helper functions decode hex strings from JSON into big-endian 32-bit and 64-bit values. They are useful for parsing datastore entries, and are exposed in a common location so bwatch can use them in the future.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
bwatch is an async block scanner that consumes blocks from bcli or any
other bitcoind interface and communicates with lightningd by sending
it updates. In this commit we're only introducing the plugin and some
files that we will populate in future commits.
This wire file primarily contains data structures that are used to serialize data for storage in the datastore. We have two types of datastore entries for bwatch:
the block history datastore and the watch datastore. For block history we store the height, the hash and the hash of the previous block.
For watches we have 4 types of watches - utxo, scriptpubkey, scid and blockdepth watches; each has its own unique info stored in the datastore. The common info for all watches includes the start block and the list of owners interested in watching.
We have 4 types of watches: utxo (outpoint), scriptpubkey, scid and
blockdepth. Each gets its own hash table with a key shape that makes
lookups direct.
bwatch keeps a tail of recent blocks (height, hash, prev hash) so it
can detect and unwind reorgs without re-fetching from bitcoind. The
datastore key for each block is zero-padded to 10 digits so
listdatastore returns blocks in ascending height order. On startup
we replay the stored history and resume from the most recent block.
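A minimal sketch of the key-padding trick (the key prefix here is hypothetical, not the actual datastore path): zero-padding the height to 10 digits makes lexicographic datastore ordering coincide with numeric height ordering.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a block-history datastore key.  Ten zero-padded digits is
 * enough for any 32-bit block height, so string order == height
 * order when listdatastore returns keys sorted lexicographically. */
static void block_key(char *buf, size_t len, unsigned int height)
{
	snprintf(buf, len, "bwatch/blocks/%010u", height);
}
```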
Each watch (and its set of owners) is serialized through the wire
format from the earlier commit and stored in the datastore. On startup
we walk each type's prefix and reload the watches into their
respective hash tables, so a restart resumes watching the same things
without anyone re-registering.
bwatch_add_watch and bwatch_del_watch are the high-level entry points
the RPCs (added in a later commit) use. Adding a watch that already
exists merges the owner list and lowers start_block if the new request
needs to scan further back, so a re-registering daemon (e.g. onchaind
on restart) doesn't lose missed events. Removing a watch drops only
the requesting owner; the watch itself is removed once the owner list
is empty.
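The merge rule can be illustrated with a toy struct (the real watch struct and owner-list handling differ; this is only the start_block logic described above):

```c
#include <assert.h>

/* Toy model of re-registering an existing watch: keep the earlier
 * start_block so the rescan covers everything either registration
 * asked for, and grow the owner set. */
struct toy_watch {
	unsigned int start_block;
	unsigned int num_owners;
};

static void merge_watch(struct toy_watch *w, unsigned int start_block)
{
	if (start_block < w->start_block)
		w->start_block = start_block;
	w->num_owners++; /* real code merges a list of owner strings */
}
```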
Add the chain-polling loop. A timer fires bwatch_poll_chain, which calls
getchaininfo to learn bitcoind's tip; if we're behind, we fetch the next
block via getrawblockbyheight, append it to the in-memory history and
persist it to the datastore. After each successful persist we reschedule
the timer at zero delay so we keep fetching back-to-back until we catch
up to the chain tip. Once getchaininfo reports no new block, we settle
into the steady-state cadence (30s by default, tunable via the
--bwatch-poll-interval option).

This commit only handles the happy path. Reorg detection, watchman
notifications and watch matching land in subsequent commits.
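The reschedule decision above reduces to one comparison; a sketch (parameter names are illustrative, with poll_interval standing in for --bwatch-poll-interval):

```c
#include <assert.h>

/* Zero delay while we are behind bitcoind's tip, so blocks are
 * fetched back-to-back; steady-state interval once caught up. */
static unsigned int next_poll_delay(unsigned int our_height,
				    unsigned int tip_height,
				    unsigned int poll_interval)
{
	return our_height < tip_height ? 0 : poll_interval;
}
```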
After bwatch persists a new tip, send a block_processed RPC to watchman
(lightningd) with the height and hash. bwatch only continues polling
for the next block once watchman has acknowledged that it has also
processed the new block height on its end.

This matters for crash safety: on restart we treat watchman's height as
the floor and re-fetch anything above it, so any block we acted on must
be visible to watchman before we move on.

If watchman isn't ready yet (e.g. lightningd still booting) the RPC
errors out non-fatally; we just reschedule and retry.
When handle_block fetches the next block, validate its parent hash
against our current tip. If they disagree we're seeing a reorg: pop our
in-memory + persisted tip via bwatch_remove_tip, walk the history one
back, and re-fetch from the new height. Each fetch may itself reorg
further, so the loop naturally peels off as many stale tips as needed
until the chain rejoins.

After every rollback, tell watchman the new tip via
revert_block_processed so its persisted height tracks bwatch's. If we
crash before the ack lands, watchman's stale height will be higher than
ours on restart, which retriggers the rollback.

If the rollback exhausts our history (we rolled back past the oldest
record we still hold) we zero current_height/current_blockhash and let
the next poll re-init from bitcoind's tip.

Notifying owners that their watches were reverted lands in a subsequent
commit.
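The peel-one-tip-per-fetch loop can be modelled with toy blocks (the real code uses bitcoin block hashes and the persisted history; this only shows the control flow):

```c
#include <assert.h>
#include <string.h>

/* Toy block: short fake hashes instead of 32-byte block hashes. */
struct toy_block { char hash[5]; char prev[5]; };

/* Returns the height to fetch next.  A fetched block whose prev
 * does not match our tip hash means our tip was reorged out: pop
 * it and re-fetch one block back.  Otherwise extend the history. */
static unsigned int handle_fetched(struct toy_block *history,
				   unsigned int *height,
				   const struct toy_block *fetched)
{
	if (strcmp(fetched->prev, history[*height].hash) != 0) {
		(*height)--; /* rollback: drop stale tip */
		return *height + 1;
	}
	(*height)++;
	history[*height] = *fetched;
	return *height + 1;
}
```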
Add two RPCs for surfacing watches to lightningd on a new block or
reorg.

bwatch_send_watch_found informs lightningd of any watches that were
found in the current processed block.  The owner is used to
disambiguate watches that may pertain to multiple subdaemons.

bwatch_send_watch_revert is sent in case of a revert; it informs the
owner that a previously reported watch has been rolled back.

These functions get wired up in subsequent commits.
After every fetched block, walk each transaction and fire watch_found
for matching scriptpubkey outputs and spent outpoints.

Outputs are matched by hash lookup against scriptpubkey_watches; inputs
by reconstructing the spent outpoint and looking it up in
outpoint_watches.
After the per-tx scriptpubkey/outpoint pass, walk every scid watch and
fire watch_found for any whose encoded blockheight matches the block
just processed.

The watch's scid encodes the expected (txindex, outnum), so we jump
straight there without scanning. If the position is out of range
(txindex past the block, or outnum past the tx) we send watch_found
with tx=NULL, which lightningd treats as the "not found" case.
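The direct jump works because a short_channel_id packs its position into 64 bits, per BOLT #7: 3 bytes of block height, 3 bytes of transaction index, 2 bytes of output index. A sketch of the decoders:

```c
#include <assert.h>
#include <stdint.h>

/* Decode the (blocknum, txnum, outnum) triple from a 64-bit scid. */
static uint32_t scid_blocknum(uint64_t scid) { return scid >> 40; }
static uint32_t scid_txnum(uint64_t scid) { return (scid >> 16) & 0xFFFFFF; }
static uint16_t scid_outnum(uint64_t scid) { return scid & 0xFFFF; }
```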
Subdaemons like channel_open and onchaind care about confirmation
depth, not the underlying tx. Walk blockdepth_watches on every new
block and send watch_found with the current depth to each owner.

This is what keeps bwatch awake in environments like Greenlight,
where we'd otherwise prefer to hibernate: as long as something is
waiting on a confirmation milestone, the blockdepth watch holds the
poll open; once it's deleted, we're free to sleep again.

Depth fires before the per-tx scan so restart-marker watches get a
chance to spin up subdaemons before any outpoint hits land for the
same block. Watches whose start_block is ahead of the tip are stale
(reorged-away, awaiting delete) and skipped.
On init, query bcli for chain name, headercount, blockcount and IBD
state, then forward the result to watchman via the chaininfo RPC
before bwatch starts its normal poll loop. Watchman uses this to
gate any work that depends on bitcoind being synced.

If bitcoind's blockcount comes back lower than our persisted tip,
peel stored blocks off until they line up so watchman gets a
consistent picture. During steady-state polling the same case is
handled by hash-mismatch reorg detection inside handle_block; this
shortcut only matters at startup, before we've fetched anything.

If bcli or watchman is not yet ready, log and fall back to scheduling
the poll loop anyway so init never stalls.

bwatch_remove_tip is exposed in bwatch.h so the chaininfo path in
bwatch_interface.c can use it.
addscriptpubkeywatch and delscriptpubkeywatch are how lightningd asks
bwatch to start/stop watching an output script for a given owner.
addoutpointwatch and deloutpointwatch are how lightningd asks bwatch
to start/stop watching a specific (txid, outnum) for a given owner.
addscidwatch and delscidwatch are how lightningd asks bwatch to
start/stop watching a specific short_channel_id for a given owner.
The scid pins the watch to one (block, txindex, outnum), so on each
new block we go straight to that position rather than scanning.
addblockdepthwatch and delblockdepthwatch are how lightningd asks
bwatch to start/stop a depth-tracker for a given (owner, start_block).
start_block doubles as the watch key and the anchor used to compute
depth = tip - start_block + 1 on every new block.
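The depth formula in one line, so the anchoring is explicit: a watch whose start_block is the block the tx confirmed in reports depth 1 on that very block.

```c
#include <assert.h>

/* depth = tip - start_block + 1 */
static unsigned int watch_depth(unsigned int tip, unsigned int start_block)
{
	return tip - start_block + 1;
}
```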
listwatch returns every active watch as a flat array. Each entry
carries its type-specific key (scriptpubkey hex, outpoint, scid
triple, or blockdepth anchor) plus the common type / start_block /
owners fields, so callers can dispatch on the per-type key without
parsing the type string first.

Mostly used by tests and operator tooling to inspect what bwatch
is currently tracking.
To support rescans (added next), bwatch_process_block_txs and
bwatch_check_scid_watches gain a `const struct watch *w` parameter
so the caller can ask the scanner to check just one watch instead
of all of them.

When a new watch is added with start_block <= current_height (say
the watch starts at block 100 but bwatch is already at 105) we
need to replay blocks 100..105 for that watch alone — not re-scan
every active watch over those blocks.

  w == NULL  -> check every active watch (normal polling)
  w != NULL  -> check only that one watch (rescan)
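The two cases above amount to a one-line filter applied per candidate watch (toy struct for illustration):

```c
#include <assert.h>
#include <stddef.h>

struct toy_watch { int id; };

/* w == NULL: normal polling, every watch qualifies.
 * w != NULL: rescan, only that single watch qualifies. */
static int should_check(const struct toy_watch *candidate,
			const struct toy_watch *w)
{
	return w == NULL || w == candidate;
}
```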
bwatch_start_rescan(cmd, w, start_block, target_block) replays
blocks from start_block..target_block for a single watch w (or
for all watches if w is NULL).

The rescan runs asynchronously: fetch_block_rescan ->
rescan_block_done -> next fetch, terminating with rescan_complete
(which returns success for an RPC-driven rescan and aux_command_done
for a timer-driven one).

Nothing calls bwatch_start_rescan yet; the add-watch RPCs wire it
up next.
bwatch_add_watch returns the watch it created (or found); each
addwatch RPC now passes that into add_watch_and_maybe_rescan,
which:

  - returns success immediately if start_block > current_height
    (the watch only cares about future blocks), and
  - otherwise calls bwatch_start_rescan over
    [start_block, current_height] for that one watch and leaves
    the RPC pending until the rescan completes.

This lets callers add a watch for an event that already confirmed
(e.g. a channel funding tx some blocks back) and still get a
watch_found.
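The branch described above can be sketched as a pure decision helper (names are illustrative, not the actual functions):

```c
#include <assert.h>

/* Decide whether a freshly added watch needs a catch-up rescan.
 * Future-only watches return 0; otherwise the rescan range
 * [start_block, current_height] is filled in and 1 returned. */
static int needs_rescan(unsigned int start_block,
			unsigned int current_height,
			unsigned int *rescan_from,
			unsigned int *rescan_to)
{
	if (start_block > current_height)
		return 0;
	*rescan_from = start_block;
	*rescan_to = current_height;
	return 1;
}
```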
Default poll cadence is 30s; tests would otherwise wait that long
between block_processed notifications. Drop to 500ms so
block-by-block assertions don't sit idle.
watchman is the lightningd-side counterpart to the bwatch plugin
landed in the previous group: it tracks how far we've processed
the chain, queues outbound watch ops while bwatch is starting up,
and dispatches watch_found/watch_revert/blockdepth notifications
to subdaemon-specific handlers.

This commit adds only the public surface (struct watchman, the
three handler typedefs, and prototypes for watchman_new,
watchman_ack, watchman_replay_pending) plus an empty watchman.c
so the header is exercised by the build. Definitions land in
subsequent commits.
Introduce the minimal storage scaffolding for the watchman module:

- db_set_blobvar / db_get_blobvar helpers for persisting binary
  values (e.g. block hashes) in the SQL `vars` table.
- load_tip(): recover last_processed_height and last_processed_hash
  from the wallet db.
- apply_rescan(): honour --rescan by adjusting the loaded tip
  downward (negative = absolute height, positive = N blocks back).
- watchman_new(): allocate the struct, initialise the pending-op
  array, and call load_tip + apply_rescan.

Wire the watchman field into struct lightningd via a forward
declaration; instantiation at startup lands in a later commit
along with the rest of the wiring.
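The sign convention of apply_rescan in one helper (a sketch of the rule stated above, clamped so a large positive rewind cannot underflow):

```c
#include <assert.h>

/* --rescan < 0: treat -rescan as an absolute height.
 * --rescan > 0: rewind that many blocks from the loaded tip. */
static unsigned int apply_rescan_height(unsigned int loaded, int rescan)
{
	if (rescan < 0)
		return (unsigned int)(-rescan);
	if ((unsigned int)rescan > loaded)
		return 0;
	return loaded - rescan;
}
```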
sangbida added 21 commits April 20, 2026 21:08
Add an optional callback on `struct plugins` that fires whenever a
plugin transitions to INIT_COMPLETE (i.e. its `init` response has
arrived).  Invoked from plugin_config_cb just before
notify_plugin_started.

This lets subsystems react to plugin readiness without polluting the
generic plugin lifecycle.  The watchman module uses it to replay any
pending watch operations to bwatch as soon as bwatch is up.
Introduce the outbound RPC path from watchman to the bwatch plugin
plus the ack lifecycle that drops a pending op once bwatch confirms
it.

- struct pending_op carries an op_id of the form "{method}:{owner}"
  (e.g. "addscriptpubkeywatch:wallet/p2wpkh/42"); method and owner
  are recoverable without a separate field.
- Datastore helpers (make_key, db_save, db_remove) persist pending
  ops at ["watchman", "pending", op_id] for crash recovery.
- send_to_bwatch finds the bwatch plugin via find_plugin_for_command
  on the method name; if bwatch is not yet INIT_COMPLETE, the send is
  silently dropped (the op stays queued and will be replayed when
  bwatch comes up).  Otherwise it builds a JSON-RPC request with the
  owner suffix and the caller-supplied json_params body, registers
  bwatch_ack_response as the callback, and sends it.
- watchman_ack searches pending_ops by op_id; on a hit it removes the
  datastore entry and drops the in-memory op.

db_save and send_to_bwatch are marked __attribute__((unused)) here
because their callers (enqueue_op, watchman_replay_pending) land in
the next commit; the markers are removed there.
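Because the op_id is "{method}:{owner}" and owner strings use '/' rather than ':', both halves are recoverable by splitting at the first colon. A sketch of the round trip (buffer handling simplified for illustration):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "{method}:{owner}". */
static void make_op_id(char *buf, size_t len,
		       const char *method, const char *owner)
{
	snprintf(buf, len, "%s:%s", method, owner);
}

/* Recover the owner: everything after the first ':'. */
static const char *op_id_owner(const char *op_id)
{
	const char *colon = strchr(op_id, ':');
	return colon ? colon + 1 : op_id;
}
```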
Both bwatch and watchman must be crash-resistant: a watch_send or an
add_watch/del_watch op may be in flight when lightningd crashes, and
neither side is allowed to lose it.  We solve this by persisting every
pending op to the datastore in enqueue_op and dropping it from the
datastore in watchman_ack.  On startup load_pending_ops rebuilds the
in-memory queue from the datastore, and watchman_on_plugin_ready
replays it once bwatch reaches INIT_COMPLETE.

watchman_add cancels any prior add for the same owner; watchman_del
cancels any pending add for the same owner before queueing the
delete.  This keeps the queue from accumulating stale or
self-cancelling op pairs across restarts.
Register the two startup RPCs that bwatch calls on launch:

- getwatchmanheight: bwatch asks how far we've already processed the
  chain so it knows what height to (re)scan from.  Returns
  {height, blockhash?} from wm->last_processed_{height,hash}.
- chaininfo: bwatch reports the chain name, header/block counts, and
  IBD status.  We fatal() on a network mismatch (wrong bitcoind),
  toggle bitcoind->synced based on IBD/header-vs-block lag, fire
  notify_new_block on the transition to synced, and remember the
  blockcount on watchman.
bwatch's polling loop calls these two RPCs to keep watchman's chain
tip in sync.

- block_processed fires after bwatch finishes scanning a block.
  Watchman advances last_processed_{height,hash}, persists them, and
  fires notify_new_block.
- revert_block_processed fires when bwatch detects a reorg.
  Watchman rewinds last_processed_{height,hash} to the supplied
  values and persists them.
Add depth_handlers[] / watch_handlers[] dispatch tables keyed by owner
prefix (sentinel-only for now; entries land alongside their handlers in
later commits) and the json_watch_found / json_watch_revert RPCs that
bwatch calls into.
Public wrappers around the internal watchman_add/watchman_del helpers
that callers (wallet, channel) will use to register/remove
WATCH_SCRIPTPUBKEY entries.  Drops the unused-attr from watchman_add
and watchman_del now that they have real callers.
Public wrappers around watchman_add/watchman_del for WATCH_OUTPOINT
entries.  Used by channel/onchaind/wallet to watch funding outpoints,
HTLC outputs and UTXOs for spends.
Public wrappers around watchman_add/watchman_del for WATCH_SCID
entries.  Used by gossipd to confirm announced channels by asking
bwatch to fetch the output at a given short_channel_id position.
Public wrappers around watchman_add/watchman_del for WATCH_BLOCKDEPTH
entries.  These fire once per new block while a tx accumulates
confirmations, used by channeld for funding-depth tracking and by
onchaind to drive CSV/HTLC maturity timers.
The next commits move wallet UTXO and tx tracking off chaintopology and
onto bwatch.  bwatch doesn't maintain a blocks table, but the legacy
utxoset, transactions and channeltxs tables all have FOREIGN KEY
references into blocks(height) (CASCADE / SET NULL), so we can't just
retarget the existing tables.

Instead, introduce parallel tables (our_outputs, our_txs) without the
blocks(height) FK.  The new bwatch-driven code writes only to these,
the legacy tables stay populated by the existing code path during this
release so downgrade still works, and a future release can drop them
once we're past the downgrade window.

Schema only here — wallet handlers that write into these tables and the
backfill from utxoset/transactions land in subsequent commits.
Lands the helpers used by the upcoming bwatch-driven scriptpubkey watch
handlers: bwatch_got_utxo, wallet_watch_scriptpubkey_common, the
our_outputs/our_txs writers, their undo helpers, and the shared revert
handler used by the p2wpkh/p2tr/p2sh_p2wpkh dispatch entries.  Also
adds the owner_wallet_utxo() owner-string constructor.

All static helpers are __attribute__((unused)) until the typed handler
commits wire them into watchman's dispatch table.  Public functions
(wallet_add_our_output, wallet_add_our_tx, wallet_del_txout_annotation,
wallet_del_tx_if_unreferenced, wallet_scriptpubkey_watch_revert) are
dead code for the same reason.

Coexists with the legacy got_utxo() / wallet_transaction_add() that
write to utxoset/transactions; the bwatch path uses renamed variants
(bwatch_got_utxo, wallet_add_our_tx) so both tables stay populated for
the downgrade window.
Add the wallet/utxo/<txid>:<outnum> dispatch entry: on watch_found,
mark the UTXO spent in our_outputs, refresh the spending tx in
our_txs, and emit a withdrawal coin movement; on watch_revert, clear
spendheight so the UTXO becomes unspent again.
In bwatch_got_utxo, register a perennial scriptpubkey watch on every
unconfirmed change output via the typed owner_wallet_p2wpkh /
owner_wallet_p2tr constructors, so we still see the confirmation
notification.

The watch uses UINT32_MAX as the start_block sentinel so bwatch keeps
it perennially armed: never skip on reorg, never rescan.
(ld, db) → keyindex / addrtype lookup over the BIP32 / BIP86 derivation
range; lower-level form of wallet_can_spend that doesn't need a fully
constructed struct wallet, so migrations (which run before ld->wallet
is wired up) can call it.
Walk legacy utxoset and back-populate our_outputs with the wallet-owned
rows (skipping outputs that don't derive from any HD key, since those
are channel funding outputs or gossip watches, not wallet UTXOs), then
bulk-copy transactions into our_txs.

This release stops updating utxoset/transactions but leaves the rows
in place, so no revert is needed: a downgraded binary just resumes
from the height the legacy tables were frozen at.
Add init_wallet_scriptpubkey_watches in wallet.c that walks every HD
key (BIP32 + BIP86) up to {bip32,bip86}_max_index + keyscan_gap and
registers a bwatch watch for each.  Call it from main() right after
setup_topology, alongside watchman_new.