Skip to content

GVN: invalidate moved-from droppable locals on Operand::Move#155296

Open
blackms wants to merge 1 commit intorust-lang:mainfrom
blackms:fix-gvn-move-invalidation-155241
Open

GVN: invalidate moved-from droppable locals on Operand::Move#155296
blackms wants to merge 1 commit intorust-lang:mainfrom
blackms:fix-gvn-move-invalidation-155241

Conversation

@blackms
Copy link
Copy Markdown

@blackms blackms commented Apr 14, 2026

Fixes #155241 (the MIR-GVN half of the issue; the extern "rust-call" const-arg half is handled in #155291).

Problem

simplify_operand was treating Operand::Move(p) and Operand::Copy(p) identically and leaving the recorded VnIndex of p.local in place even after the value had been consumed. A later aggregate rvalue with the same VnIndex was then rewritten to Operand::Copy(p.local) by visit_assign, and StorageRemover downgraded the original Move to a Copy as well, turning a single move into a double-use of memory the callee had already dropped.

The reproducer in the issue is a pair of FnOnce-consumed (Vec::new(),) tuples, both of which receive the same aggregate VnIndex because Vec::new() lowers to a deterministic Aggregate rvalue. GVN then made the second call_once(move _7) into call_once(copy _5) after the first call had already taken ownership of _5, leading to either an unreachable!() panic or heap corruption (SIGABRT on stable 1.94.1) at -Copt-level=2+.

Fix

This patch records simplify_operand's Move branch separately and, when the moved place's type needs_drop, drops everything GVN knows about the backing local via a new invalidate_local helper. Calls reach this path through super_terminator -> visit_operand, so terminator-arg moves are covered with no extra plumbing. !needs_drop types are intentionally left alone: for them Move and Copy are observationally equivalent in the post-borrowck MIR GVN runs on, so the existing unification (e.g. moves of &mut T, *mut T, plain integer aggregates) is preserved.

Soundness sketch

After _5 = move _x GVN no longer believes _5 holds a re-derivable value for any locally-tracked VnIndex, so it cannot rewrite a future _y = some_aggregate_with_same_vn into _y = copy _5. The invalidation is gated on needs_drop because that is precisely when the post-move source is left in an "unspecified state": the callee may drop / move-out of the slot, and the original drop glue for _5 may have been elided by drop elaboration (which ran before GVN).

The narrower alternative — invalidate only at TerminatorKind::Call arguments — was considered and rejected: the same unsoundness arises within a single basic block when a partial move (_x = move _5.field) is followed by another aggregate with the same VnIndex as _5's, and StorageRemover can no longer rescue that pattern.

Test plan

  • New run-pass test tests/ui/mir/gvn-fnonce-miscompile.rs runs the MCVE under -Copt-level=3 -Zmir-enable-passes=+GVN and asserts the second closure sees a fresh, empty Vec.
  • New mir-opt diff test tests/mir-opt/gvn_move_invalidation.rs exercises the post-fix MIR shape for both the droppable (no rewrite) and plain (constant-fold) cases.
  • Three pre-existing mir-opt tests lose the unsound aggregate-to-copy rewrite as a documented consequence:
    • tests/mir-opt/pre-codegen/try_identity.rs (old desugaring): the Ok(x?) identity is no longer collapsed into _0 = copy _1 for the generic Result<T, E>. The new desugaring is unaffected.
    • tests/mir-opt/gvn_copy_aggregate.rs::remove_storage_dead: the rebuilt AlwaysSome::<T>::Some(_) is no longer rewritten to _0 = copy _3.
    • tests/mir-opt/early_otherwise_branch_unwind.rs: EarlyOtherwiseBranch now emits a fresh discriminant read on the cleanup edge instead of reusing the cached one.

x test tests/mir-opt, tests/ui/mir, tests/ui/borrowck, tests/ui/closures, tests/ui/drop, tests/codegen-llvm, tests/assembly-llvm all pass on aarch64-apple-darwin --stage 1. x test tests/ui --stage 1 made it past 13200/21004 cases with no failures before being interrupted for time.

End-to-end verification

  • Bug reproduces on system rustc 1.94.1: -Copt-level=2 exit 134 (SIGABRT).
  • Bug reproduces on stage0 (unpatched): -Copt-level=3 -Zmir-enable-passes=+GVN exit 133.
  • Bug fixed on stage1 (patched): same command exits 0.

Open questions

The following are genuine soundness / policy questions that survive this fix; flagging them for the mir-opt reviewers rather than trying to address them in this PR:

  1. Should StorageRemover::visit_operand also gate its Move -> Copy downgrade on !needs_drop? This PR stops GVN from amplifying that downgrade into unsoundness, but the downgrade itself remains a footgun for any future pass that adds itself to reused_locals. Probably warrants a separate hardening PR.
  2. Indirect-by-ref Call arguments (PassMode::Indirect { mode: ByMutRef, .. }) can be mutated by the callee. Today GVN ignores this entirely. This PR doesn't make that worse, but it's a real soundness gap that should be tracked separately.
  3. simplify_aggregate_to_copy still synthesises Operand::Copy(place) from a place computed by try_as_place without a proper liveness check. The invalidate_local mechanism here keeps the rewrite honest from the GVN side, but a principled liveness analysis would let us drop the needs_drop heuristic and unify even droppable aggregates when it's safe — natural transition path to the "Option B" explored in the investigation.

r? mir-opt


LLM disclosure: this PR was bootstrapped by a Claude-based coding agent team that read the GVN pass end-to-end, formed the root-cause hypothesis, and drafted the initial patch and tests. I then validated the work manually: reproduced the bug on system rustc 1.94.1 and on stage0, ran x test across mir-opt, ui/mir, ui/borrowck, ui/closures, ui/drop, codegen-llvm, and assembly-llvm on aarch64-apple-darwin --stage 1 to confirm there are no regressions beyond the three pre-existing tests documented above, and reviewed the diff line-by-line before pushing. Happy to answer any question about the reasoning or rewrite anything that reads as machine-y.

`simplify_operand` was treating `Operand::Move(p)` and `Operand::Copy(p)`
identically and leaving the recorded `VnIndex` of `p.local` in place even after
the value had been consumed. A later aggregate rvalue with the same `VnIndex`
was then rewritten to `Operand::Copy(p.local)` by `visit_assign`, and
`StorageRemover` downgraded the original `Move` to a `Copy` as well, turning a
single move into a double-use of memory the callee had already dropped.

The reproducer is a pair of `FnOnce`-consumed `(Vec::new(),)` tuples, both of
which receive the same aggregate `VnIndex` because `Vec::new()` lowers to a
deterministic aggregate; GVN then made the second `call_once(move _7)` into
`call_once(copy _5)` after the first call had already taken ownership of `_5`.

This patch records `simplify_operand`'s `Move` branch separately and, when the
moved place's type `needs_drop`, drops everything GVN knows about the backing
local via a new `invalidate_local` helper. Calls reach this path through
`super_terminator` -> `visit_operand`, so terminator-arg moves are covered with
no extra plumbing. `!needs_drop` types are left untouched: for them `Move` and
`Copy` are observationally equivalent in post-borrowck MIR, so the existing
unification (e.g. moves of `&mut T`, `*mut T`, plain integer aggregates) is
preserved.

Three pre-existing mir-opt tests lose the unsound aggregate-to-copy rewrite as
a consequence:

  - `tests/mir-opt/pre-codegen/try_identity.rs` (`old` desugaring): the
    `Ok(x?)` identity is no longer collapsed into `_0 = copy _1` for the
    generic `Result<T, E>`. The `new` desugaring is unaffected.
  - `tests/mir-opt/gvn_copy_aggregate.rs::remove_storage_dead`: the rebuilt
    `AlwaysSome::<T>::Some(_)` is no longer rewritten to `_0 = copy _3`.
  - `tests/mir-opt/early_otherwise_branch_unwind.rs`: `EarlyOtherwiseBranch`
    now emits a fresh `discriminant` read on the cleanup edge instead of
    reusing the cached one.

The new tests pin the fix down:
  - `tests/ui/mir/gvn-fnonce-miscompile.rs` runs the issue's MCVE under
    `-Copt-level=3 -Zmir-enable-passes=+GVN` and asserts the second closure
    sees a fresh, empty `Vec`.
  - `tests/mir-opt/gvn_move_invalidation.rs` exercises the post-fix MIR shape
    for both the droppable (no rewrite) and plain (constant-fold) cases.
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 14, 2026

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

release-mode heap corruption with non-generic FnOnce(Vec<usize>)

3 participants