Intelligently advance nonce #95

Open
sevenmachines wants to merge 2 commits into main from fix/stop-nonce-hopping

@sevenmachines
Contributor

  • fix: stop nonce-hopping on chunk upload retries
  • fix: verify chunks via IPFS gateway after nonce-advance fallback

Description

Type

  • Bug fix
  • Feature
  • Breaking change
  • Documentation
  • Chore

Package

  • @parity/dotns-cli
  • Root/monorepo
  • Documentation

Related Issues

Fixes

Checklist

Code

  • Follows project style
  • bun run lint passes
  • bun run format passes
  • bun run typecheck passes

Documentation

  • README updated if needed
  • Types updated if needed

Breaking Changes

  • No breaking changes
  • Breaking changes documented below

Breaking changes:

Testing

How to test:

Notes

@github-actions

github-actions bot commented Apr 16, 2026

CI Summary

| Check | Result |
| --- | --- |
| Lint | Passed |
| Format | Failed - 0 files need formatting |
| Typecheck | Passed |
| Build | Passed |
| Release | Passed |
| Deploy Example | Passed |
| PR Title | Failed |
| Labels | Passed |
| Test | Passed - 140 passed, 0 failed |

Format - Failed

0 files need formatting

View run

$ bunx prettier --check .
Checking formatting...
[warn] tests/unit/bulletin/nonceHopping.test.ts
[warn] Code style issues found in the above file. Run Prettier with --write to fix.
error: script "format" exited with code 1

Release - Passed

Test this PR

Download artifact (GitHub CLI required):

gh run download 24507419192 -n cli-release-0.0.0-pr.95 -R paritytech/dotns-sdk

Install globally:

npm install -g ./parity-dotns-cli-0.0.0-pr.95.tgz

Verify:

dotns --help
Deploy Example - Passed

| Stage | Status |
| --- | --- |
| ✓ Site validation | Site validated |
| ✓ Deploy | Deployed |

| Property | Value |
| --- | --- |
| Domain | pr95.dotns-example-site.dot |
| CID | bafybeicx555prsudvmqsdrap7keoilryrudwycdx3ody2rejwqwjwp52u4 |
| URL (dot.li) | https://dot.li/pr95.dotns-example-site.dot |
| URL (direct) | https://pr95.dotns-example-site.dot |
| Duration | 156s |

View run

PR Title

Use: type(scope): description

Example: feat(cli): add bulletin upload command

View run

Labels

pkg: cli, scope: bulletin, type: test

Test - Passed

140 passed, 0 failed across 140 tests.

View run

$ bun test tests/unit/
bun test v1.2.6 (8ebd5d53)

::group::tests/unit/text/textHelp.test.ts:
(pass) text --help lists subcommands and auth options [13.00ms]
(pass) text view --help shows name and key arguments [3.00ms]
(pass) text set --help shows name, key, and value arguments [3.00ms]

::endgroup::

::group::tests/unit/content/contentHelp.test.ts:
result:  {
  exitCode: 0,
  standardOutput: "Usage: dotns content [options] [command]\n\nManage domain content hashes\n\nOptions:\n  --rpc <wsUrl>               WebSocket RPC endpoint (env: DOTNS_RPC)\n  --keystore-path <path>      Keystore path (env: DOTNS_KEYSTORE_PATH)\n  --min-balance <pas>         Minimum balance in PAS (env:\n                              DOTNS_MIN_BALANCE_PAS)\n  --account <name>            Keystore account name (default: keystore default)\n  --password <pw>             Keystore password (env: DOTNS_KEYSTORE_PASSWORD)\n  -m, --mnemonic <phrase>     BIP39 mnemonic phrase (env: DOTNS_MNEMONIC)\n  -k, --key-uri <uri>         Substrate key URI (env: DOTNS_KEY_URI)\n  -h, --help                  display help for command\n\nCommands:\n  view [options] <name>       View domain content hash\n  set [options] <name> <cid>  Set domain content hash (IPFS CID)\n  help [command]              display help for command\n",
  standardError: "",
  combinedOutput: "Usage: dotns content [options] [command]\n\nManage domain content hashes\n\nOptions:\n  --rpc <wsUrl>               WebSocket RPC endpoint (env: DOTNS_RPC)\n  --keystore-path <path>      Keystore path (env: DOTNS_KEYSTORE_PATH)\n  --min-balance <pas>         Minimum balance in PAS (env:\n                              DOTNS_MIN_BALANCE_PAS)\n  --account <name>            Keystore account name (default: keystore default)\n  --password <pw>             Keystore password (env: DOTNS_KEYSTORE_PASSWORD)\n  -m, --mnemonic <phrase>     BIP39 mnemonic phrase (env: DOTNS_MNEMONIC)\n  -k, --key-uri <uri>         Substrate key URI (env: DOTNS_KEY_URI)\n  -h, --help                  display help for command\n\nCommands:\n  view [options] <name>       View domain content hash\n  set [options] <name> <cid>  Set domain content hash (IPFS CID)\n  help [command]              display help for command\n",
}
(pass) content --help lists subcommands and auth options [3.00ms]
(pass) content view --help shows name argument and options [3.00ms]
(pass) content set --help shows name and cid arguments [1.00ms]

::endgroup::

::group::tests/unit/auth/auth.test.ts:
(pass) auth set creates keystore and stores multiple accounts [320.00ms]
(pass) auth set accepts account names with special characters [467.00ms]
(pass) auth list reports missing keystore [3.00ms]
(pass) auth list shows all accounts and auth types [310.00ms]
(pass) auth use switches default account [482.00ms]
(pass) auth remove last account clears default [82.00ms]
(pass) auth remove preserves remaining accounts and reassigns default [240.00ms]
(pass) auth clear deletes all accounts [95.00ms]

::endgroup::

::group::tests/unit/auth/authHelp.test.ts:
(pass) root help lists auth command [2.00ms]
(pass) auth help shows options and subcommands [2.00ms]
(pass) auth set help shows all options [1.00ms]
(pass) auth list help shows options [2.00ms]
(pass) auth use help shows options [1.00ms]
(pass) auth remove help shows options [1.00ms]
(pass) auth clear help shows options [2.00ms]
(pass) auth parses keystore-path option [1.00ms]
(pass) auth parses password option [2.00ms]
(pass) auth set parses account option [1.00ms]
(pass) auth set parses mnemonic option [2.00ms]
(pass) auth set parses key-uri option [2.00ms]
(pass) auth help command shows help [1.00ms]
(pass) auth help set shows set command help [2.00ms]
(pass) auth help list shows list command help [1.00ms]
(pass) auth help use shows use command help [1.00ms]
(pass) auth help remove shows remove command help [2.00ms]
(pass) auth help clear shows clear command help [1.00ms]

::endgroup::

::group::tests/unit/auth/authRevert.test.ts:
(pass) auth set rejects account name with forward slash [3.00ms]
(pass) auth set rejects account name with backslash [1.00ms]
(pass) auth set rejects account name that is just a dot [1.00ms]
(pass) auth set rejects account name that is double dots [1.00ms]
(pass) auth set rejects account name starting with dot [1.00ms]
(pass) auth set rejects account name ending with dot [1.00ms]
(pass) auth set rejects account name with special characters [13.00ms]
(pass) auth set rejects account name that is too long [2.00ms]
(pass) auth use rejects non-existent account [80.00ms]
(pass) auth remove rejects non-existent account [88.00ms]

::endgroup::

::group::tests/unit/store/storeHelp.test.ts:
(pass) store --help lists subcommands [1.00ms]
(pass) store info --help shows auth options [2.00ms]
(pass) store list --help shows options [1.00ms]
(pass) store get --help shows key argument [1.00ms]
(pass) store set --help shows key and value arguments with auth [1.00ms]
(pass) store delete --help shows key argument with auth [1.00ms]
(pass) store check --help shows address argument [1.00ms]
(pass) store authorize --help shows address argument with auth [1.00ms]
(pass) store unauthorize --help shows address argument [1.00ms]
(pass) store authorize-controller --help shows address argument [2.00ms]
(pass) store unauthorize-controller --help shows address argument [1.00ms]
(pass) store ensure-auth --help shows description and auth options [1.00ms]

::endgroup::

::group::tests/unit/pop/setPopHelp.test.ts:
(pass) root help lists pop command [2.00ms]
(pass) pop help shows commands and description [1.00ms]
(pass) pop help shows auth options [1.00ms]
(pass) pop set help shows status parameter [1.00ms]
(pass) pop set help shows auth options [1.00ms]
(pass) pop info help shows description [1.00ms]
(pass) pop info help shows auth options [1.00ms]
(pass) pop set parses rpc option at pop level [1.00ms]
(pass) pop set parses rpc option at set level [1.00ms]
(pass) pop set parses keystore-path option [1.00ms]
(pass) pop set parses account option at pop level [1.00ms]
(pass) pop set parses account option at set level [1.00ms]
(pass) pop set parses password option [1.00ms]
(pass) pop set parses mnemonic option [1.00ms]
(pass) pop set parses key-uri option [1.00ms]
(pass) pop info parses auth options at pop level [2.00ms]
(pass) pop info parses auth options at info level [1.00ms]
(pass) pop set parses mixed options across levels [1.00ms]
(pass) pop help command shows pop help [1.00ms]
(pass) pop help set shows set help [1.00ms]
(pass) pop help info shows info help [1.00ms]

::endgroup::

::group::tests/unit/account/accountHelp.test.ts:
(pass) account --help lists subcommands including is-mapped, is-whitelisted, whitelist [2.00ms]
(pass) account is-mapped --help shows address argument and --json [1.00ms]
(pass) account is-whitelisted --help shows address argument and --json [1.00ms]
(pass) account whitelist --help shows address argument, --remove, and --json [1.00ms]
(pass) account is alias works for is-mapped [1.00ms]
(pass) account iw alias works for is-whitelisted [1.00ms]

::endgroup::

::group::tests/unit/cli/reporter.test.ts:
(pass) cli reporter > stream reporter emits durable progress lines
(pass) cli reporter > withConsoleToStderr redirects console and stdout writes

::endgroup::

::group::tests/unit/bulletin/uploadManifest.test.ts:
(pass) upload manifest resume behavior > returns stale manifest when fingerprint does not match [2.00ms]
(pass) upload manifest resume behavior > deduplicates completed blocks by index

::endgroup::

::group::tests/unit/bulletin/uploadProfiling.test.ts:
(pass) upload profiler > writes schema-complete profile report with peak aggregation [19.00ms]
(pass) upload profiler > default profile path is deterministic for a given fingerprint [1.00ms]

::endgroup::

::group::tests/unit/bulletin/bulletinHelp.test.ts:
(pass) root help lists bulletin command [2.00ms]
(pass) bulletin help shows commands and description [1.00ms]
(pass) bulletin upload help shows all options [1.00ms]
(pass) bulletin upload help shows default values [1.00ms]
(pass) bulletin authorize help shows all options [1.00ms]
(pass) bulletin authorize help shows default values [1.00ms]
(pass) bulletin history help shows options [1.00ms]
(pass) bulletin history:remove help shows usage [1.00ms]
(pass) bulletin history:clear help shows description [1.00ms]
(pass) bulletin help command shows bulletin help [1.00ms]
(pass) bulletin help upload shows upload help [2.00ms]
(pass) bulletin help authorize shows authorize help
(pass) bulletin status help shows all options [1.00ms]
(pass) bulletin help status shows status help
(pass) bulletin list alias works [1.00ms]
(pass) bulletin verify help shows usage [1.00ms]
(pass) bulletin help verify shows verify help [1.00ms]

::endgroup::

::group::tests/unit/bulletin/nonceHopping.test.ts:
(pass) nonce-hopping: old behavior (duplicate nonces) > reassigning nonces on each wave retry wastes nonces and creates duplicates [1.00ms]
(pass) nonce-hopping: new behavior (persistent nonces) > re-queued chunks keep their original nonce, no duplicates
(pass) nonce-hopping: new behavior (persistent nonces) > chunks whose nonces were NOT consumed are re-queued with the SAME nonce [1.00ms]
(pass) nonce-hopping: runWaveWithRetries preserves chunk identity > retried chunks within a wave keep the same nonce from the waveNonces map [4.00ms]
(pass) nonce-hopping: quantifying the waste > old behavior wastes N×W nonces for N chunks over W waves
(pass) nonce-hopping: quantifying the waste > models the CI run: chunk 3 stored 6× under old behavior, 1× under new

::endgroup::

::group::tests/unit/register/registerHelp.test.ts:
(pass) root help lists register command [2.00ms]
(pass) register help shows subcommands [1.00ms]
(pass) register domain help shows options [1.00ms]
(pass) register subname help shows options [1.00ms]
(pass) register domain parses status none [1.00ms]
(pass) register domain parses status lite [1.00ms]
(pass) register domain parses status full [1.00ms]
(pass) register domain parses reverse flag [1.00ms]
(pass) register domain parses governance flag
(pass) register domain parses owner option [1.00ms]
(pass) register domain parses transfer with destination [1.00ms]
(pass) register domain parses account option [1.00ms]
(pass) register domain parses keystore-path option [1.00ms]
(pass) register domain parses password option [1.00ms]
(pass) register domain parses mnemonic option [1.00ms]
(pass) register domain parses key-uri option [1.00ms]
(pass) register domain parses commitment-buffer option [1.00ms]
(pass) register domain parses commitment-buffer alias --cb [1.00ms]
(pass) register subname parses name and parent [1.00ms]
(pass) register subname parses owner option [1.00ms]
(pass) getCommitmentBufferSeconds defaults to 6 when env is not set
(pass) getCommitmentBufferSeconds reads from DOTNS_COMMITMENT_BUFFER env variable
(pass) COMMITMENT_POLL_INTERVAL_MS is 2000
(pass) COMMITMENT_POLL_TIMEOUT_MS is 30000

::endgroup::

::group::tests/unit/lookup/lookupHelp.test.ts:
(pass) lookup --help lists subcommands and auth options [3.00ms]
(pass) lookup name --help shows label argument and options [1.00ms]
(pass) lookup owner-of --help shows label argument and options [1.00ms]
(pass) lookup transfer --help shows label argument and destination option [1.00ms]
(pass) lookup transfer parses destination at transfer level [1.00ms]
(pass) lookup transfer parses auth options at lookup level [1.00ms]

::endgroup::

 140 pass
 0 fail
 541 expect() calls
Ran 140 tests across 15 files. [2.75s]

When a chunk's signSubmitAndWatch subscription times out, the uploader
fetches a fresh nonce and resubmits the same chunk. The abandoned
transaction is often still included on chain (the nonce advances), so
each retry stores a duplicate 2 MiB chunk and adds another transaction
to the pool. Under contention this creates a feedback loop: duplicate
txs inflate the pool, the collator's txpool readiness timeout fires,
blocks are produced without the pending chunk, and the client retries
again — we observed a single chunk stored 6 times in one CI run,
wasting ~36 s of block capacity and driving the pool to 70 txs.

Fix: (1) add a nonce-advance fallback to storeContentOnBulletin — on
timeout, check system_accountNextIndex via a fresh WS; if the nonce
advanced past the submitted one, the tx was included, resolve as
success. (2) track assigned nonces across waves in a persistent map so
re-queued chunks keep their original nonce instead of allocating a new
one. (3) on wave failure, check each chunk's nonce against the current
account nonce before re-queuing — chunks whose nonces were consumed are
marked complete rather than resubmitted.

The nonce-advance fallback can falsely mark a chunk as completed if
another process consumed the nonce (concurrent uploader on the same
account, or a dispatch error that consumed the nonce without storing).
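
For orientation, the nonce-advance fallback whose false positives this commit guards against might be sketched as below. This is a minimal illustration, not the code from the PR: `resolveWithNonceFallback`, `StoreOutcome`, and the injected `fetchAccountNextIndex` callback (standing in for a `system_accountNextIndex` query over a fresh connection) are all hypothetical names.

```typescript
// Hypothetical sketch of a nonce-advance fallback; names are illustrative only.
type StoreOutcome =
  | { kind: "confirmed"; blockHash: string } // normal subscription path
  | { kind: "nonce-advanced" }; // inferred inclusion, no block data available

async function resolveWithNonceFallback(
  watch: Promise<{ blockHash: string }>, // stands in for the watched submission
  submittedNonce: number,
  fetchAccountNextIndex: () => Promise<number>, // assumed wrapper around system_accountNextIndex
  timeoutMs: number,
): Promise<StoreOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<"timeout">((resolve) => {
    timer = setTimeout(() => resolve("timeout"), timeoutMs);
  });

  try {
    const winner = await Promise.race([watch, timedOut]);
    if (winner !== "timeout") {
      return { kind: "confirmed", blockHash: winner.blockHash };
    }
  } finally {
    clearTimeout(timer);
  }

  // Timed out: if the account's next index is past our nonce, the nonce was consumed.
  const nextIndex = await fetchAccountNextIndex();
  if (nextIndex > submittedNonce) {
    return { kind: "nonce-advanced" };
  }
  throw new Error(
    `timed out and nonce ${submittedNonce} was not consumed (next index ${nextIndex})`,
  );
}
```

A `nonce-advanced` outcome only shows that something consumed the nonce, which is why it cannot be trusted on its own when another uploader shares the account.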

When a store call resolves via the nonce-advance fallback (storedIndex
and blockHash both undefined), immediately verify the CID resolves via
the IPFS gateway (2 attempts, 3s apart). If it doesn't resolve, throw
so the chunk is retried. Chunks confirmed via the normal subscription
path are trusted and skip verification.
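
The verification step described here could look roughly like the following sketch. It assumes an injected probe function so the retry logic can be exercised without a network; `verifyCidViaGateway`, `needsGatewayCheck`, `fetchProbe`, and the gateway base URL parameter are hypothetical names, not taken from the PR. The defaults mirror the message (2 attempts, 3 s apart).

```typescript
// Hypothetical sketch of post-fallback gateway verification; names are illustrative only.
type HeadProbe = (cid: string) => Promise<boolean>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// A result with neither storedIndex nor blockHash came from the nonce-advance fallback.
function needsGatewayCheck(result: { storedIndex?: number; blockHash?: string }): boolean {
  return result.storedIndex === undefined && result.blockHash === undefined;
}

// Throws if the CID never resolves, so the caller can retry the chunk.
async function verifyCidViaGateway(
  cid: string,
  probe: HeadProbe,
  attempts = 2,
  delayMs = 3000,
): Promise<void> {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      if (await probe(cid)) return;
    } catch {
      // A network error counts as "not resolved" for this attempt.
    }
    if (attempt < attempts) await sleep(delayMs);
  }
  throw new Error(`CID ${cid} did not resolve via gateway after ${attempts} attempts`);
}

// One possible probe: a HEAD request against an assumed path-style gateway URL.
const fetchProbe =
  (gatewayBase: string): HeadProbe =>
  async (cid) => {
    const response = await fetch(`${gatewayBase}/ipfs/${cid}`, { method: "HEAD" });
    return response.ok;
  };
```

Injecting the probe keeps the retry policy testable; a caller would pass `fetchProbe(...)` with whichever gateway the uploader is configured to use.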

Also adds nonceHopping.test.ts demonstrating old vs new nonce behavior.
@sevenmachines sevenmachines force-pushed the fix/stop-nonce-hopping branch from 9a6e76e to 663093a Compare April 16, 2026 11:21
@paritytech paritytech deleted a comment from cla-bot-2021 bot Apr 16, 2026
@sevenmachines
Contributor Author

Concurrent 16 MB chunked upload on bulletin-paseo, same account, two uploaders running simultaneously.

Key metrics

| Metric | Without fix (Apr 15) | With fix (Apr 16) | Improvement |
| --- | --- | --- | --- |
| Peak tx pool size | 70 | 6 | 12× lower |
| Max extrinsics/block | 42 (drain burst) | 4 | No drain bursts |
| Duplicate chunk stores | ~6× per stuck chunk | 0 | Eliminated |
| Txpool timeout pattern | Sustained alternating 12/2 | Scattered, no alternation | No empty blocks |
| Block proposal time p99 | 0.49s spikes | 0.24s steady | No proposer stalls |
| Total upload time | 6m17s + OOM crash | 3m05s | 2× faster, no crash |
| Nonce-advance fallbacks | N/A | 2 (both correct) | New capability |

Tx pool depth during upload

Without fix (Apr 15, 16:39–16:53 UTC)

16:39 ██ 2 (baseline)
16:41 ██████████ 11
16:42 ████████████████████████████████████████████████████████████ 60
16:43 ██████████████████████████████████████████████████████████████████████ 70 ← PEAK
16:44 ████████████████████████████████ 32
16:45 ██████████████████████████████████████████████████████████████████████ 70
16:47 ████████████████████████████████ 32
16:49 ██████████████ 14
16:51 ██ 2 (back to normal)

With fix (Apr 16, 11:05–11:13 UTC)

11:05 ░ 0 (baseline)
11:06 █ 1
11:07 █ 1
11:08 ███ 3
11:09 ██████ 6 ← PEAK
11:10 ████ 4
11:11 ░ 0 (back to normal)

Extrinsics per proposed block

Without fix — alternating full/empty pattern

Block 1046726 ████████████ 12 (11ms) full
Block 1046728 ████████████ 12 (9ms) full
Block 1046730 ████████████ 12 (43ms) full, slowing
Block 1046734 ██ 2 (229ms) ← TXPOOL TIMEOUT — only inherents
Block 1046735 ██ 2 (228ms) ← TXPOOL TIMEOUT
Block 1046737 █████████████ 13 (75ms) recovery
Block 1046738 ██ 2 (229ms) ← TXPOOL TIMEOUT
Block 1046752 ████████████████████████████████ 32 (80ms) drain burst
Block 1046754 ██████████████████████████████████████████ 42 (156ms) drain burst

With fix — steady 2–4 extrinsics, no timeouts

Block 1056849 ███ 3 (66ms)
Block 1056851 ███ 3 (68ms)
Block 1056853 ███ 3 (67ms)
Block 1056855 ███ 3 (69ms)
Block 1056857 ███ 3 (4ms)
Block 1056859 ██ 2 (3ms)
Block 1056871 ████ 4 (128ms)
Block 1056873 ███ 3 (67ms)

Root cause and fix

Problem: When a chunk's signSubmitAndWatch subscription timed out, the
uploader fetched a fresh nonce and resubmitted. The abandoned transaction was
often already included (nonce advanced), so each retry stored a duplicate 2 MiB
chunk. Under contention this created a feedback loop: duplicate txs inflated
the pool → txpool readiness timeout fired → proposer produced empty blocks →
inclusion latency increased → more timeouts → more duplicates.

Fix (two parts):

  1. Persistent nonce tracking — chunks keep their original nonce across wave
    retries instead of getting a fresh one. Eliminates duplicate submissions.
  2. Nonce-advance fallback — on timeout, check system_accountNextIndex;
    if the nonce advanced past the submitted one, the tx was included. Mark as
    complete instead of resubmitting. Verified inline via IPFS gateway HEAD
    request to catch false positives from concurrent account usage.
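
As a rough illustration of persistent nonce tracking, together with the consumed-nonce check on wave failure that the first commit message mentions, consider the sketch below. It is not the PR's implementation: `assignNonces`, `partitionAfterWaveFailure`, and the `ChunkId` type are hypothetical, and the account's next index is passed in as a plain number rather than fetched over RPC.

```typescript
// Hypothetical sketch of persistent nonce tracking; names are illustrative only.
type ChunkId = number;

// Give each chunk a nonce once; chunks already in the map keep their original nonce.
// Returns the next unassigned nonce.
function assignNonces(
  chunkIds: ChunkId[],
  assigned: Map<ChunkId, number>,
  nextFreeNonce: number,
): number {
  for (const id of chunkIds) {
    if (!assigned.has(id)) {
      assigned.set(id, nextFreeNonce);
      nextFreeNonce += 1;
    }
  }
  return nextFreeNonce;
}

// After a failed wave, chunks whose nonce is below the account's next index
// are treated as consumed (complete); the rest are re-queued with the SAME nonce.
function partitionAfterWaveFailure(
  failedChunkIds: ChunkId[],
  assigned: Map<ChunkId, number>,
  accountNextIndex: number,
): { completed: ChunkId[]; requeue: ChunkId[] } {
  const completed: ChunkId[] = [];
  const requeue: ChunkId[] = [];
  for (const id of failedChunkIds) {
    const nonce = assigned.get(id);
    if (nonce !== undefined && nonce < accountNextIndex) {
      completed.push(id);
    } else {
      requeue.push(id);
    }
  }
  return { completed, requeue };
}

// Usage: three chunks get nonces 10..12; a retry of chunks 1 and 2 plus a new chunk 3
// allocates only one extra nonce.
const nonces = new Map<ChunkId, number>();
let nextNonce = assignNonces([0, 1, 2], nonces, 10);
nextNonce = assignNonces([1, 2, 3], nonces, nextNonce);
console.log(partitionAfterWaveFailure([1, 2, 3], nonces, 12));
```

Because the map outlives a single wave, a retry can never hand a chunk a second nonce, which is the property that removes duplicate stores.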

CI run references
