Execute delayed lowering in a sandbox#155337
Execute delayed lowering in a sandbox#155337aerooneqq wants to merge 10 commits intorust-lang:mainfrom
Conversation
This comment has been minimized.
This comment has been minimized.
9028f44 to
297f41a
Compare
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Execute delayed lowering in a sandbox
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (65d1474): comparison URL. Overall result: ❌ regressions - please read:Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf. Next, please: If you can, justify the regressions found in this try perf run in writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 7.7%, secondary 4.8%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 2.2%, secondary -0.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis perf run didn't have relevant results for this metric. Bootstrap: 489.055s -> 495.053s (1.23%) |
This comment has been minimized.
This comment has been minimized.
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Execute delayed lowering in a sandbox
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (244e060): comparison URL. Overall result: ❌✅ regressions and improvements - please read:Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf. Next, please: If you can, justify the regressions found in this try perf run in writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 2.5%, secondary 3.2%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary 0.7%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis perf run didn't have relevant results for this metric. Bootstrap: 490.756s -> 493.643s (0.59%) |
Code is not yet ready to review.
Motivation
This PR researches the possibility to execute delayed lowering in a sandbox: a special "untracked" mode of the compiler which allows to correctly obtain all information (which is the result of ty-level queries execution) for lowering of delayed owners.
Without the sandbox we will eventually run into query cycles which results in non-relevant error messages, and what's more important we simply can't execute needed workflows, i.e., resolving inherent methods in delegations, for this resolution we need
hir_crate_itemsto complete, however if we insert ids of delegations intohir_crate_itemsthen we will certainly run into query cycle for example when computingassociated_itemwhich requires to fillhas_selffield which in turn requires the resolution of currently lowered delegation. We may try not to include ids of delayed owners intohir_crate_items, however that will break all compilation as we can't update the result of the query.This problem is solved by a sandbox: the core idea that we execute some code to lower delayed owners, then we restore the state of the compiler as it was before sandbox execution, and continue compilation but with correctly lowered delayed owners. During sandbox we may fill queries like
hir_crate_itemswith invalid (from without sandbox compilation point of view) values and execute lowering in a "forget that delayed owners exist in the codebase" mode. This should help to eliminate query cycles from delayed owners lowering.Implementation details
Queries caches
The first thing to think about are query caches: when sandbox execution is over we need to invalidate caches that were filled during sandbox. The approach that is implemented in this PR is simply dropping all keys that were filled during sandbox, for this purpose we remember keys that queries had before sandbox, and after it we drop all keys that are not present in this remembered set.
However, this approach is not optimal, as some queries may contain valid values for keys filled during sandbox, consider
generics_of, to obtain generics of some function we do not need to interact with unit queries that were filled with invalid values likehir_crate_items, so we should not invalidate it for performance reasons (at lest in non-incremental mode).The invalidation approach should allow skipping modifications of almost all caches, thus it should not affect stable part of the compiler, as we invalidate them only if sandbox was executed.
Untracked
In addition to query caches we have untracked data, the most obvious example of which are definitions that are created during resolve or AST -> HIR lowering stage (or even later). The problem here is that untracked data can be modified during sandbox execution and in theory we also need to invalidate it. Why? Because definitions created during sandbox can be recreated during after-sandbox stages of compilation (i.e., impl Traits) and this will cause a correct panic.
One way of solving it as it was mentioned above is invalidation of all definitions that were created during sandbox, BUT definitions that were created during delayed AST -> HIR lowering are valid and they will not be recreated after sandbox. So we need to preserve them. Taking into account those factors now the implementation is as follows:
allow_overwriteset,allow_overwritebut only once,allow_overwriteis not considered, as during sandbox we may also try to create same definitions, in this case we need to panic.DepGraph
We can compile the program in incremental mode that involves creation of
DepNodes.DepNodecan not be created twice, and by the nature of sandbox (it can be thought just as regular non-tracked computation which reuses some of the compiler's logic and infrastructure) we should not in any sense interact with dependencies. That is why we pretend that there is noDepGraphduring sandbox and execute all queries in non-incremental mode. When sandbox is over we continue compilation with theDepGraphas usual.Integration of delayed lowering into the query system
There is a separate problem of integrating sandboxed delayed lowering in the compiler flow for the following reason: we must drop
ResolverAstLoweringandast::Crateafter we lowered all AST owners, otherwise there will be perf regressions, as the users of compiler API implicitly rely on this. This fact blocks us from creatingforce_delayed_owners_loweringfunction and calling it in the beginning ofanalysis(())query, as in rustdoc analysis is not called.The obvious option is to call
force_delayed_owners_loweringin the beginning of thehir_crate_itemsquery, and that is how it is implemented now, but if we invoke sandboxed lowering fromhir_crate_items, which requireshir_crate_itemsto be run before lowering (see second pseudocode in the end), we will encounter a query cycle.callfront(it is like opposition forcallback) which will be executed by the query infrastructure before the selected query (in our casehir_crate_items). With this approach user of the compiler API do not have to manually drop AST and as far as I understand this approach will work even if thequery_execute_fnis overridden.TyCtxt untracked caches
There are some untracked caches in TyCtxt itself (i.e.,
clauses_cache,evaluation_cache), we just clear them after sandbox, I did not dive too deep in researching a question whether we can not invalidate some of them, that's a future work.Trimmed def paths
There is an assert in trimmed def paths query that ensures that it is executed only once so it is turned off in a sandbox.
Parallel frontend interaction
Now sandbox is executed in a place where no other queries are run, so synchronization issues were not considered. Basically it is a continuation of
hir_cratewhere we finish AST -> HIR lowering, all other queries requirehir_crate(()).Diagnostics
During sandbox we should not emit anything (i.e.,
tcx.dcx().make_silent()), maybe the only exception are delayed-owners-related diagnostics, that is also a future work.Pseusocode
Here is an approximate routine for execution of an operation in a sandbox.
And that's what we want to execute:
Why sandbox?
hir_crate_itemsbefore delayed lowering without sandbox in delegation: fix cycles during delayed lowering #154368. It means thathir_crate_itemscontained ids of delayed owners and it did not solved the problem: delegation: OOM #155060 was immediately reported. It happened because during trimmed def paths the identifier is obtained through accessing HIR of items, and as we are now lowering delayed owner this causes a cycle. In earlier versions of delegation: fix cycles during delayed lowering #154368 there was an attempt to store information about delayed owners before delayed lowering and then access it bypassing HIR. However this approach leads to manually insertingif tcx.is_delayed_owner(def_id)in possibly many queries. So when new features are added to the compiler author or those features should know about delayed owners and think how to correctly obtain needed data. That is not good approach. Sandbox solves those issues by removing the ids of delayed owners fromhir_crate_itemsandhir_module_items(and thus hopefully from all compilation), moreover we may adjustrustc_hir::intravisit::Visitorto not to visit delayed owners (now this check is only inItemCollector)ProbeContextwhich requirescrate_inherent_implsto run, which in turn requireshir_crate_items. So withouthir_crate_itemswe can't perform inh. methods resolution but now we force delayed lowering beforehir_crate_items. In this case there are only two options: either to executehir_crate_itemsas in delegation: fix cycles during delayed lowering #154368 which is not scalable approach or to create a sandbox. Note that inh. methods resolution requires collecting candidates which in turn requires executingassociated_itemquery, andAssocItemstruct has ahas_selffield that requires HIR of delayed owner. So we can not execute resolution of inh. methods without settinghir_crate_itemsand in this aspect sandbox helps us a lot: ashir_crate_itemswill be available and will not contain ids of delayed owners.Honestly speaking, the algorithm for inh. methods resolution will possibly be iterative: as an input you have a set of default HIR (called
default) and a set of delayed owners (calleddelayed), next you execute a loop: if there there are some resolved elements fromdelayedagainstdefaultyou replacedefaultwith those resolved elements and continue until!delayed.is_empty(). If at some iteration there are no resolutions anddelayedis not empty then there are cycle in recursive delegations. With such an iterative algorithm sandbox should help as we may updatehir_crate_itemson every iteration,try_queryfunction that can returnNoneif there is a query cycle. It may help with lowering of delegations without inh. methods, but once again it is not guaranteed as query cycles may appear some time after when someone who is not aware of delayed owners will implement his features,r? @petrochenkov