Summary
There is a race condition in sequencer recovery where a recovery sequencer can begin producing blocks before P2P catchup has completed. If the original blocks later arrive via DA retrieval or P2P sync, the node correctly detects a double-sign/equivocation because the same signing key produced different headers at the same height.
The issue is not with double-sign detection itself. The new detection logic is correctly exposing an existing flaw in the recovery flow.
Observed Behavior
This was discovered while implementing double-sign detection in: #3310
TestSequencerRecoveryFromP2P began failing because the recovery sequencer can produce conflicting blocks before synchronization completes. The test already acknowledged this possibility:
recovery node produced its own blocks (P2P sync was not completed in time)
With the new double-sign detection logic, the node now correctly detects the equivocation and halts via sendCriticalError.
Before PR #3310, the conflicting headers were silently ignored due to pendingHeaders first-write-wins behavior.
Root Cause Analysis
Recovery synchronization is not a strict prerequisite for block production, which can begin before catchup has fully completed. If catchup times out, the sequencer may still start aggregation using incomplete state, allowing it to produce blocks at heights that already contain signed blocks from the original sequencer session.
Failure flow:
- Original sequencer produces N blocks using the genesis sequencer key and submits some to DA.
- Original sequencer stops.
- Fullnode has all N blocks via P2P gossip.
- Recovery sequencer starts with:
- a fresh store
- the same signing key
- P2P catchup enabled
- Catchup does not complete before timeout.
- Recovery sequencer proceeds with partial or empty local state and starts producing blocks.
- Original blocks later arrive through DA retrieval/P2P sync.
Syncer.detectDoubleSign detects:
- same signer
- same height
- different header hash
- Evidence is recorded and the node halts via
sendCriticalError.
Impact
A recovering sequencer can unintentionally equivocate against its own previously produced chain during startup. This results in:
- valid double-sign evidence being generated
- sequencer halt during recovery
- unsafe recovery behavior after restart/outage
Expected Behavior
A recovery sequencer should never begin block production until chain continuity has been fully established.
If synchronization cannot complete safely, recovery should fail rather than continue with partial state.
Suggested Fixes
- Gate block production until successful catchup completion.
- Verify chain continuity before producing blocks at height
H.
- Consider making recovery sync timeout fatal instead of continuing with partial state.
Current Workaround
TestSequencerRecoveryFromP2P has been temporarily skipped in PR #3310 pending a proper fix for the recovery race.
The test should be re-enabled once recovery guarantees synchronization safety before block production begins.
Summary
There is a race condition in sequencer recovery where a recovery sequencer can begin producing blocks before P2P catchup has completed. If the original blocks later arrive via DA retrieval or P2P sync, the node correctly detects a double-sign/equivocation because the same signing key produced different headers at the same height.
The issue is not with double-sign detection itself. The new detection logic is correctly exposing an existing flaw in the recovery flow.
Observed Behavior
This was discovered while implementing double-sign detection in: #3310
TestSequencerRecoveryFromP2P began failing because the recovery sequencer can produce conflicting blocks before synchronization completes. The test already acknowledged this possibility:
With the new double-sign detection logic, the node now correctly detects the equivocation and halts via
sendCriticalError.Before PR #3310, the conflicting headers were silently ignored due to
pendingHeadersfirst-write-wins behavior.Root Cause Analysis
Recovery synchronization is not a strict prerequisite for block production, which can begin before catchup has fully completed. If catchup times out, the sequencer may still start aggregation using incomplete state, allowing it to produce blocks at heights that already contain signed blocks from the original sequencer session.
Failure flow:
Syncer.detectDoubleSigndetects:sendCriticalError.Impact
A recovering sequencer can unintentionally equivocate against its own previously produced chain during startup. This results in:
Expected Behavior
A recovery sequencer should never begin block production until chain continuity has been fully established.
If synchronization cannot complete safely, recovery should fail rather than continue with partial state.
Suggested Fixes
H.Current Workaround
TestSequencerRecoveryFromP2Phas been temporarily skipped in PR #3310 pending a proper fix for the recovery race.The test should be re-enabled once recovery guarantees synchronization safety before block production begins.