GVN: invalidate moved-from droppable locals on Operand::Move#155296
Open
blackms wants to merge 1 commit intorust-lang:mainfrom
Open
GVN: invalidate moved-from droppable locals on Operand::Move#155296blackms wants to merge 1 commit intorust-lang:mainfrom
blackms wants to merge 1 commit intorust-lang:mainfrom
Conversation
`simplify_operand` was treating `Operand::Move(p)` and `Operand::Copy(p)`
identically and leaving the recorded `VnIndex` of `p.local` in place even after
the value had been consumed. A later aggregate rvalue with the same `VnIndex`
was then rewritten to `Operand::Copy(p.local)` by `visit_assign`, and
`StorageRemover` downgraded the original `Move` to a `Copy` as well, turning a
single move into a double-use of memory the callee had already dropped.
The reproducer is a pair of `FnOnce`-consumed `(Vec::new(),)` tuples, both of
which receive the same aggregate `VnIndex` because `Vec::new()` lowers to a
deterministic aggregate; GVN then made the second `call_once(move _7)` into
`call_once(copy _5)` after the first call had already taken ownership of `_5`.
This patch records `simplify_operand`'s `Move` branch separately and, when the
moved place's type `needs_drop`, drops everything GVN knows about the backing
local via a new `invalidate_local` helper. Calls reach this path through
`super_terminator` -> `visit_operand`, so terminator-arg moves are covered with
no extra plumbing. `!needs_drop` types are left untouched: for them `Move` and
`Copy` are observationally equivalent in post-borrowck MIR, so the existing
unification (e.g. moves of `&mut T`, `*mut T`, plain integer aggregates) is
preserved.
Three pre-existing mir-opt tests lose the unsound aggregate-to-copy rewrite as
a consequence:
- `tests/mir-opt/pre-codegen/try_identity.rs` (`old` desugaring): the
`Ok(x?)` identity is no longer collapsed into `_0 = copy _1` for the
generic `Result<T, E>`. The `new` desugaring is unaffected.
- `tests/mir-opt/gvn_copy_aggregate.rs::remove_storage_dead`: the rebuilt
`AlwaysSome::<T>::Some(_)` is no longer rewritten to `_0 = copy _3`.
- `tests/mir-opt/early_otherwise_branch_unwind.rs`: `EarlyOtherwiseBranch`
now emits a fresh `discriminant` read on the cleanup edge instead of
reusing the cached one.
The new tests pin the fix down:
- `tests/ui/mir/gvn-fnonce-miscompile.rs` runs the issue's MCVE under
`-Copt-level=3 -Zmir-enable-passes=+GVN` and asserts the second closure
sees a fresh, empty `Vec`.
- `tests/mir-opt/gvn_move_invalidation.rs` exercises the post-fix MIR shape
for both the droppable (no rewrite) and plain (constant-fold) cases.
Collaborator
|
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #155241 (the MIR-GVN half of the issue; the
extern "rust-call"const-arg half is handled in #155291).Problem
simplify_operandwas treatingOperand::Move(p)andOperand::Copy(p)identically and leaving the recordedVnIndexofp.localin place even after the value had been consumed. A later aggregate rvalue with the sameVnIndexwas then rewritten toOperand::Copy(p.local)byvisit_assign, andStorageRemoverdowngraded the originalMoveto aCopyas well, turning a single move into a double-use of memory the callee had already dropped.The reproducer in the issue is a pair of
FnOnce-consumed(Vec::new(),)tuples, both of which receive the same aggregateVnIndexbecauseVec::new()lowers to a deterministicAggregatervalue. GVN then made the secondcall_once(move _7)intocall_once(copy _5)after the first call had already taken ownership of_5, leading to either anunreachable!()panic or heap corruption (SIGABRT on stable 1.94.1) at-Copt-level=2+.Fix
This patch records
simplify_operand'sMovebranch separately and, when the moved place's typeneeds_drop, drops everything GVN knows about the backing local via a newinvalidate_localhelper. Calls reach this path throughsuper_terminator -> visit_operand, so terminator-arg moves are covered with no extra plumbing.!needs_droptypes are intentionally left alone: for themMoveandCopyare observationally equivalent in the post-borrowck MIR GVN runs on, so the existing unification (e.g. moves of&mut T,*mut T, plain integer aggregates) is preserved.Soundness sketch
After
_5 = move _xGVN no longer believes_5holds a re-derivable value for any locally-trackedVnIndex, so it cannot rewrite a future_y = some_aggregate_with_same_vninto_y = copy _5. The invalidation is gated onneeds_dropbecause that is precisely when the post-move source is left in an "unspecified state": the callee may drop / move-out of the slot, and the original drop glue for_5may have been elided by drop elaboration (which ran before GVN).The narrower alternative — invalidate only at
TerminatorKind::Callarguments — was considered and rejected: the same unsoundness arises within a single basic block when a partial move (_x = move _5.field) is followed by another aggregate with the sameVnIndexas_5's, andStorageRemovercan no longer rescue that pattern.Test plan
run-passtesttests/ui/mir/gvn-fnonce-miscompile.rsruns the MCVE under-Copt-level=3 -Zmir-enable-passes=+GVNand asserts the second closure sees a fresh, emptyVec.tests/mir-opt/gvn_move_invalidation.rsexercises the post-fix MIR shape for both the droppable (no rewrite) and plain (constant-fold) cases.tests/mir-opt/pre-codegen/try_identity.rs(olddesugaring): theOk(x?)identity is no longer collapsed into_0 = copy _1for the genericResult<T, E>. Thenewdesugaring is unaffected.tests/mir-opt/gvn_copy_aggregate.rs::remove_storage_dead: the rebuiltAlwaysSome::<T>::Some(_)is no longer rewritten to_0 = copy _3.tests/mir-opt/early_otherwise_branch_unwind.rs:EarlyOtherwiseBranchnow emits a freshdiscriminantread on the cleanup edge instead of reusing the cached one.x test tests/mir-opt,tests/ui/mir,tests/ui/borrowck,tests/ui/closures,tests/ui/drop,tests/codegen-llvm,tests/assembly-llvmall pass onaarch64-apple-darwin --stage 1.x test tests/ui --stage 1made it past 13200/21004 cases with no failures before being interrupted for time.End-to-end verification
rustc 1.94.1:-Copt-level=2exit 134 (SIGABRT).-Copt-level=3 -Zmir-enable-passes=+GVNexit 133.Open questions
The following are genuine soundness / policy questions that survive this fix; flagging them for the mir-opt reviewers rather than trying to address them in this PR:
StorageRemover::visit_operandalso gate itsMove -> Copydowngrade on!needs_drop? This PR stops GVN from amplifying that downgrade into unsoundness, but the downgrade itself remains a footgun for any future pass that adds itself toreused_locals. Probably warrants a separate hardening PR.PassMode::Indirect { mode: ByMutRef, .. }) can be mutated by the callee. Today GVN ignores this entirely. This PR doesn't make that worse, but it's a real soundness gap that should be tracked separately.simplify_aggregate_to_copystill synthesisesOperand::Copy(place)from a place computed bytry_as_placewithout a proper liveness check. Theinvalidate_localmechanism here keeps the rewrite honest from the GVN side, but a principled liveness analysis would let us drop theneeds_dropheuristic and unify even droppable aggregates when it's safe — natural transition path to the "Option B" explored in the investigation.r? mir-opt
LLM disclosure: this PR was bootstrapped by a Claude-based coding agent team that read the GVN pass end-to-end, formed the root-cause hypothesis, and drafted the initial patch and tests. I then validated the work manually: reproduced the bug on system rustc 1.94.1 and on stage0, ran
x testacross mir-opt, ui/mir, ui/borrowck, ui/closures, ui/drop, codegen-llvm, and assembly-llvm onaarch64-apple-darwin --stage 1to confirm there are no regressions beyond the three pre-existing tests documented above, and reviewed the diff line-by-line before pushing. Happy to answer any question about the reasoning or rewrite anything that reads as machine-y.