Parallel overlap checking with staged workflow#20963
Parallel overlap checking with staged workflow#20963agheata merged 20 commits intoroot-project:masterfrom
Conversation
Voxels marked as needing rebuild are built lazily, but this is a thread unsafe operation as it touches shared data. This introduces a method TGeoManager::RebuildVoxels that is called before checking overlaps, removing rebuilding potentially happening during the checking. The method TGeoChecker::CheckOverlaps is rewritten to avoid changing the list of overlaps attached to nodes, which was also a thread unsafe construct.
…;lelizable compute phase. Checking overlaps by sampling points in volumes implemented now as CheckOverlapsBySampling, and marked as deprecated. In future we will support a surface-based sampling like in Geant4.
…sing the separation axis theorem (SAT).
…y checks. This reduces by a large factor the number of real checks to be done. In assembly-based setups such as ALICE, the factor is ~1000.
…d added new ones. The legacy implementations for solids did not honour the contract that in case a shape overrides this method, it has to provide exactly the requested number of points if it returns true. The legacy versions even wrote more than the requested points, overwriting memory. Fixes cache overriding of points by visualization, that made the point generation not deterministic.
…site shapes. Previously, mesh points set via TGeoCompositeShape::SetPoints were cached only at the first call, making mesh generation invariant to the number of segments requested via TGeoManager. Now the caches are invalidated and rebuild lazily whenever the number of segment changes in the manager.
…cker implementation file
|
Working on fixing no-IMT mode |
Test Results 22 files 22 suites 3d 15h 36m 6s ⏱️ Results for commit cf4af72. ♻️ This comment has been updated with latest results. |
This was needed because the test is run in the same process as test_material_units, which changes the statically-initialized unit system in TGeoManager, inducing wrong material interpretation in GDML import used in the current test. This is not normal and should be debugged separately, but forcing the ROOT units seems to fix the problem Split the geometry tests to avoid static interference
| ipoints--; | ||
| nperseg = remaining; | ||
|
|
||
| const Double_t dx = (p1[0] - p0[0]) / (nperseg + 1); |
There was a problem hiding this comment.
if this is performance critical, we could avoid 3 divisions and just have one
There was a problem hiding this comment.
It is not, but I agree it can be optimized
| continue; | ||
|
|
||
| const Double_t phi = ig * dphi; | ||
| const Double_t c = TMath::Cos(phi); |
There was a problem hiding this comment.
opportunity to save cycles with sincos.
There was a problem hiding this comment.
Yes, there are many places where this can be done
There was a problem hiding this comment.
I wonder why sincos is not available in TMath?
|
|
||
| // Center of the φ bin => never exactly on the cut planes | ||
| const Double_t phi = phi1 + (ig + 0.5) * dphi; | ||
| const Double_t c = TMath::Cos(phi); |
| if (dz == 0.0) | ||
| continue; | ||
| const Double_t len = std::abs(dz); | ||
| zsegs.push_back({i, z0, z1, len}); |
There was a problem hiding this comment.
Agreed in general for complex/heavy constructors, but nowadays for trivial structs push_back and emplace_back generate the same code, the compiler moves the data from the temporary to the vector memory (POD copy), inlining and eliding aggressively, so no temporaries survive codegen
|
|
||
| // Generator center angle: avoids phi endpoints | ||
| const Double_t phi = (phi1Deg + (ig + 0.5) * dphi) * TMath::DegToRad(); | ||
| const Double_t c = TMath::Cos(phi); |
| Int_t outPoints = npoints; | ||
| Int_t inPoints = 0; | ||
| if (hasInner) { | ||
| outPoints = (npoints + 1) / 2; |
There was a problem hiding this comment.
opportunity for one division less?
There was a problem hiding this comment.
you mean: outPoints = (npoints + 1) >> 1 ? If yes, I was doing this agressively at some point, until I discovered that it is worse: it hides the intention (looks more like a hack, not math), on signed integers it is wrong, and the compiler eventually generates the same code. So n / 2 lets the compiler take the decision on the implementation
| const Int_t rem = nSurfPoints % nGen; | ||
|
|
||
| // Use generator centers: never exactly phi1 or phi2 | ||
| const Double_t dphi = dPhiTot / (Double_t)nGen; |
There was a problem hiding this comment.
Agreed, I'm a bit maniac due to the work on mixed precision in VecGeom. Harmless here.
| continue; | ||
|
|
||
| const Double_t phi = phi1 + (ig + 0.5) * dphi; // avoid endpoints | ||
| const Double_t c = TMath::Cos(phi); |
There was a problem hiding this comment.
The problem is that sincos is a non-standard extension that exists on most platforms, but not on all (such as mac/windows flavors), so IMHO TMath should (it does not) provide a safe wrapper for it that falls back on regilar Sin/Cos where not available). Or maybe Iam wrong?
|
|
||
| const Double_t phi = phi1 + (ig + 0.5) * dphi; // avoid endpoints | ||
| const Double_t c = TMath::Cos(phi); | ||
| const Double_t s = TMath::Sin(phi); |
| #include "TMap.h" | ||
| #include "TFile.h" | ||
| #include "TKey.h" | ||
| #include <TString.h> |
| void TGeoVolume::SetFWExtension(TGeoExtension *ext) | ||
| { | ||
| TGeoExtension* tmp = fFWExtension; | ||
| TGeoExtension *tmp = fFWExtension; |
| #define ROOT_TGeoChecker | ||
|
|
||
| #include "TVirtualGeoChecker.h" | ||
| #include "TGeoOverlapCandidate.h" |
| UInt_t id; | ||
| Int_t level; | ||
| // Check extrusion only for daughters of a non-assembly volume | ||
| Int_t ncand = 0; |
There was a problem hiding this comment.
Did I miss some warnings?
| @@ -1,6 +1,7 @@ | |||
| #include <gtest/gtest.h> | |||
There was a problem hiding this comment.
is this test enough? Maybe there are more cases to check and that would be very useful!
There was a problem hiding this comment.
Indeed, we can have a more extended check, but the thing with overlap checking is that the results may change if for some reason, the points generated on the shape meshes change. Overlap checking only strives not to give false positives, but may miss. So this case is exactly doing this: knows that the shapes mother/daughter have some common faces, but this is correct, so no extrusion should be reported. Actually this unit test triggered the error on Windows, that was a real bug, and also other errors when changing some global static and having 2 geometry managers in the same process.
This Pull request:
Refactors and accelerates geometry overlap checking by splitting the procedure into well-defined stages, introducing candidate filtering and parallel execution, and fixing long-standing meshing and caching issues affecting correctness and performance. Preserves backward-compatibility of interfaces.
Changes or fixes:
TGeoManager::SetPoints()(default 20) andTGeoManager::SetNmeshVertices()(default 1000).TGeoChecker.TBuffer3DbuffersTGeoCompositeShape / TGeoBoolNodeGetPointsOnSegments()contracts in several shapesParallel execution
ROOT::TThreadExecutorand respectsROOT::EnableImplicitMT(N).Correctness fixes in meshing
GetPointsOnSegments()for several shapes to:trueonly if exactlynpointsare filledTGeoBBox,TGeoTube,TGeoCone,TGeoPcon,TGeoCtub).Performance impact
Checklist:
This PR was done at the request of ALICE experiment, but provides setup-independent enhancements of the overlap checking feature