Fix flaky handshake + set bootargs explicitly for NAND install #63
Merged
Three resilience fixes for the transient handshake failure observed on hi3516av200 (one in N flashes failed at "Failed to send DDR step" with no code change between successes):

1. Distinct DDR-step error attribution. _send_ddr_step now returns a phase-specific error string instead of a bool, so callers can tell handshake-timeout from PRESTEP0/DDRSTEP0/PRESTEP1 frame failures. Previously every failure surfaced as "Failed to send DDR step", which was misleading when the actual failure was the sendFrameForStart handshake never latching.

2. Drain-until-silent replaces fixed sleep + flush. Transport gets a drain_until_silent(quiet_period, max_wait) helper that loops reading until the line stays quiet long enough. Session uses it after power_cycle. More robust than a fixed 2s sleep + tcflush: a powered-off chip can't transmit, so silence is a deterministic ready signal, and we don't lose late-arriving stale bytes that beat the flush.

3. Retry the handshake/DDR phase on transient failure. RecoverySession.run wraps power_cycle + handshake + DDR-init in a retry loop (default 2 attempts) when programmatic power control is available. Past-DDR failures are not retried since they're slow and rarely transient. Without power control, retries are pointless and disabled.

11 regression tests in test_handshake_resilience.py cover all three fixes: DDR-step error attribution, drain_until_silent quiet-period / max-wait / idle-line behavior, and session retry behavior across all relevant cases (transient handshake, post-DDR failure, no power control, retries exhausted).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
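The drain-until-silent idea from this commit can be illustrated with a minimal sketch. This is not the code from the PR: `read_chunk`, the `clock` parameter, and the default values are assumptions made for illustration, and only the `quiet_period` / `max_wait` parameter names come from the commit message.

```python
import time
from typing import Callable


def drain_until_silent(
    read_chunk: Callable[[float], bytes],
    quiet_period: float = 0.25,
    max_wait: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Discard incoming bytes until the line stays quiet for quiet_period.

    read_chunk(timeout) is a hypothetical stand-in for the transport's read:
    it returns whatever arrived within `timeout` seconds, or b"" if nothing
    did. Returns the number of bytes discarded. Gives up after max_wait
    seconds even if the line is still noisy.
    """
    start = clock()
    last_byte_at = start
    discarded = 0
    while True:
        now = clock()
        if now - last_byte_at >= quiet_period:
            return discarded  # line has been silent long enough
        if now - start >= max_wait:
            return discarded  # still noisy, but the time budget is spent
        # Wait no longer than what is left of either deadline.
        timeout = min(quiet_period - (now - last_byte_at),
                      max_wait - (now - start))
        chunk = read_chunk(timeout)
        if chunk:
            discarded += len(chunk)
            last_byte_at = clock()  # any byte restarts the quiet timer
```

Injecting the clock keeps the helper testable without real sleeps; a real transport would presumably pass a serial-port read with a timeout instead.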
defib install previously set ``mtdparts`` and ``bootcmd`` but relied on
U-Boot's compiled-in default ``bootargs``. Recent OpenIPC U-Boot builds
default to:
    root=/dev/ubiblock0_0 rootfstype=squashfs ubi.block=0,0 init=/init
even when the actual ``rootfs.ubi.hi3516av200`` from the same release
contains UBIFS. Result: the kernel tries to mount the root filesystem as
squashfs and panics with "Unable to mount root fs ... tried squashfs".
Fix: defib now ``setenv bootargs`` to match the rootfs format it just
wrote — UBIFS bootargs when the rootfs file is a UBI image (extracted
to UBIFS), squashfs+ubiblock bootargs otherwise. Bootargs string built
in a small ``_nand_bootargs`` helper for unit-testability.
Tested on hi3516av200:
- Kernel command line now: root=ubi0:rootfs rootfstype=ubifs ...
- UBIFS mounts, init runs, dropbear/syslog start, login prompt reached
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Two real-hardware bugs fixed together (both verified on hi3516av200):
1. Flaky boot ROM handshake → reliable retry + better diagnostics
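The transient "Failed to send DDR step" on first install (succeeds on retry with no code change) is fixed by three changes:
- `_send_ddr_step` returns a phase-specific error string instead of a bool, so handshake timeouts are distinguishable from PRESTEP0/DDRSTEP0/PRESTEP1 frame failures.
- `drain_until_silent(quiet_period, max_wait)` replaces the fixed sleep + flush after `power_cycle`.
- `RecoverySession.run` retries power_cycle + handshake + DDR-init (default 2 attempts) when programmatic power control is available.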
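A rough sketch of how the retry and the phase-specific error strings might fit together is shown here. The function name, callback shapes, and error strings are hypothetical and not taken from the PR; only the default of 2 attempts and the rule that retries need power control come from the description.

```python
from typing import Callable, Optional


def run_pre_ddr_with_retry(
    power_cycle: Optional[Callable[[], None]],
    handshake_and_ddr: Callable[[], Optional[str]],
    max_attempts: int = 2,
) -> Optional[str]:
    """Run handshake + DDR init, retrying only when power control exists.

    handshake_and_ddr (hypothetical) returns None on success or an error
    string naming the phase that failed, e.g. "handshake timeout".
    Returns None on success, otherwise the error from the last attempt.
    """
    # Without programmatic power control the chip cannot be reset between
    # attempts, so a retry would hit the same state: allow one attempt only.
    attempts = max(1, max_attempts) if power_cycle is not None else 1
    error: Optional[str] = None
    for _ in range(attempts):
        if power_cycle is not None:
            power_cycle()
        error = handshake_and_ddr()
        if error is None:
            return None
    return error
```

Failures after DDR init would be handled outside this loop, matching the note that they are not retried.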
2. `defib install` set wrong bootargs → kernel panic on boot
Recent OpenIPC U-Boot defaults to `rootfstype=squashfs` even when `rootfs.ubi` from the same release contains UBIFS, causing a kernel panic ("Unable to mount root fs ... tried squashfs"). Fix: defib now runs `setenv bootargs` to match the rootfs format it just wrote (UBIFS for UBI images, squashfs+ubiblock otherwise) instead of trusting U-Boot's compiled-in default. The bootargs string is built in a small `_nand_bootargs` helper for unit-testability.
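As an illustration of the selection logic, the sketch below picks the root-filesystem arguments from the image type. It is not the PR's `_nand_bootargs`: the function names are hypothetical, the magic-byte check is an assumed detection method, and only the argument fragments quoted in this PR are returned (the rest of the real command line is elided in the source and is not reconstructed here).

```python
# A UBI image begins with an erase-counter header whose magic is "UBI#".
UBI_EC_MAGIC = b"UBI#"


def looks_like_ubi_image(head: bytes) -> bool:
    """Assumed detection: check the first four bytes of the rootfs file."""
    return head[:4] == UBI_EC_MAGIC


def nand_root_args(rootfs_is_ubi: bool) -> str:
    """Return only the root-selection fragment of bootargs (hypothetical helper)."""
    if rootfs_is_ubi:
        # Fragment quoted in the PR's test notes; remaining args are elided there.
        return "root=ubi0:rootfs rootfstype=ubifs"
    # Fragment from the U-Boot default quoted in the PR.
    return "root=/dev/ubiblock0_0 rootfstype=squashfs ubi.block=0,0"


if __name__ == "__main__":
    print(nand_root_args(looks_like_ubi_image(b"UBI#\x01\x00\x00\x00")))
```

Keeping the string construction in a pure function like this is what makes it straightforward to unit-test without hardware.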
Test plan
🤖 Generated with Claude Code