Skip to content

Parallel overlap checking with staged workflow#20963

Merged
agheata merged 20 commits intoroot-project:masterfrom
agheata:par_ovlp
Jan 27, 2026
Merged

Parallel overlap checking with staged workflow#20963
agheata merged 20 commits intoroot-project:masterfrom
agheata:par_ovlp

Conversation

@agheata
Copy link
Member

@agheata agheata commented Jan 20, 2026

This Pull request:

Refactors and accelerates geometry overlap checking by splitting the procedure into well-defined stages, introducing candidate filtering and parallel execution, and fixing long-standing meshing and caching issues affecting correctness and performance. Preserves backward-compatibility of interfaces.

Changes or fixes:

  • Three-stage overlap checking workflow
image
  1. Candidate enumeration
    • Traverses the geometry tree.
    • Uses oriented bounding box (OBB) tests to reject non-overlapping volume pairs early. This gives in general a sizeable candidate reduction compared to the legacy version, and a massive one for assembly-dominated setups (a factor of ~1000 for ALICE O² geometry)
  2. Surface point generation and caching
    • Generates surface sampling points per shape using the values set via TGeoManager::SetPoints() (default 20) and TGeoManager::SetNmeshVertices() (default 1000).
    • Introduces a thread-safe per-shape mesh cache inside TGeoChecker.
    • Cache invalidation is correctly handled when the number of mesh points or mesh segments changes
    • Fixes incorrect behavior caused by:
      • reuse of static TBuffer3D buffers
      • stale mesh caches in TGeoCompositeShape / TGeoBoolNode
      • inconsistent GetPointsOnSegments() contracts in several shapes
  3. Exact overlap checking following check points density
    • Performs navigation-based overlap tests using cached surface points.
    • This stage is now parallelized using ROOT Implicit MT.
  • Parallel execution

    • Only the final overlap checking stage is parallelized (safe and scalable).
    • Uses ROOT::TThreadExecutor and respects ROOT::EnableImplicitMT(N).
    • Demonstrates good strong scaling on realistic detector geometries.
  • Correctness fixes in meshing

    • Reworked GetPointsOnSegments() for several shapes to:
      • avoid dependence on TBuffer3D static storage
      • generate points only on valid surface generators
      • respect a strict contract: return true only if exactly npoints are filled
    • Fixed incorrect assumptions in legacy implementations (e.g. TGeoBBox, TGeoTube, TGeoCone, TGeoPcon, TGeoCtub).
  • Performance impact

    • The check is more thorough (points more uniformly spread on solid surfaces) and typically much faster, because it reduces the number of candidates for the expensive check, and parallelizes this most expensive part.
    • On typical setups (e.g. cmsRun4D116), the gain using 32 threads is significant
c_strong_scaling_cms
  • On assembly-rich setups (e.g. o2sim_geometry) the gain is massive (from ~4 hours to ~15 sec on the same machine)
  • Below is a strong scaling plot showing both the total time and the time for the parallelized checking stage for ALICE. Note that this is not the typical profile, it is shown mainly to show the scaling of the parallelized stage. For most setups, the weight of the sequential stages 1 and 2 is much smaller that the weight of the parallel stage 3.
c_strong_scaling

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

This PR was done at the request of ALICE experiment, but provides setup-independent enhancements of the overlap checking feature

agheata and others added 15 commits January 20, 2026 10:42
Voxels marked as needing rebuild are built lazily, but this is a thread unsafe operation as it touches shared data. This introduces a method TGeoManager::RebuildVoxels that is called before checking overlaps, removing rebuilding potentially happening during the checking. The method TGeoChecker::CheckOverlaps is rewritten to avoid changing the list of overlaps attached to nodes, which was also a thread unsafe construct.
…;lelizable compute phase.

Checking overlaps by sampling points in volumes implemented now as CheckOverlapsBySampling, and marked as deprecated. In future we will support a surface-based sampling like in Geant4.
…y checks.

This reduces by a large factor the number of real checks to be done. In assembly-based setups such as ALICE, the factor is ~1000.
…d added new ones.

The legacy implementations for solids did not honour the contract that in case a shape overrides this method, it has to provide exactly the requested number of points if it returns true. The legacy versions even wrote more than the requested points, overwriting memory.

Fixes cache overriding of points by visualization, that made the point generation not deterministic.
…site shapes.

Previously, mesh points set via TGeoCompositeShape::SetPoints were cached only at the first call, making mesh generation invariant to the number of segments requested via TGeoManager. Now the caches are invalidated and rebuild lazily whenever the number of segment changes in the manager.
@agheata
Copy link
Member Author

agheata commented Jan 20, 2026

Working on fixing no-IMT mode

@github-actions
Copy link

github-actions bot commented Jan 20, 2026

Test Results

    22 files      22 suites   3d 15h 36m 6s ⏱️
 3 771 tests  3 771 ✅ 0 💤 0 ❌
75 898 runs  75 898 ✅ 0 💤 0 ❌

Results for commit cf4af72.

♻️ This comment has been updated with latest results.

@agheata agheata marked this pull request as draft January 22, 2026 09:38
This was needed because the test is run in the same process as test_material_units, which changes the statically-initialized unit system in TGeoManager, inducing wrong material interpretation in GDML import used in the current test. This is not normal and should be debugged separately, but forcing the ROOT units seems to fix the problem

Split the geometry tests to avoid static interference
ipoints--;
nperseg = remaining;

const Double_t dx = (p1[0] - p0[0]) / (nperseg + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is performance critical, we could avoid 3 divisions and just have one

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not, but I agree it can be optimized

continue;

const Double_t phi = ig * dphi;
const Double_t c = TMath::Cos(phi);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opportunity to save cycles with sincos.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there are many places where this can be done

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why sincos is not available in TMath?


// Center of the φ bin => never exactly on the cut planes
const Double_t phi = phi1 + (ig + 0.5) * dphi;
const Double_t c = TMath::Cos(phi);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sincos opportunity

if (dz == 0.0)
continue;
const Double_t len = std::abs(dz);
zsegs.push_back({i, z0, z1, len});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emplace?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed in general for complex/heavy constructors, but nowadays for trivial structs push_back and emplace_back generate the same code, the compiler moves the data from the temporary to the vector memory (POD copy), inlining and eliding aggressively, so no temporaries survive codegen


// Generator center angle: avoids phi endpoints
const Double_t phi = (phi1Deg + (ig + 0.5) * dphi) * TMath::DegToRad();
const Double_t c = TMath::Cos(phi);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sincos opportunity?

Int_t outPoints = npoints;
Int_t inPoints = 0;
if (hasInner) {
outPoints = (npoints + 1) / 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opportunity for one division less?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean: outPoints = (npoints + 1) >> 1 ? If yes, I was doing this agressively at some point, until I discovered that it is worse: it hides the intention (looks more like a hack, not math), on signed integers it is wrong, and the compiler eventually generates the same code. So n / 2 lets the compiler take the decision on the implementation

const Int_t rem = nSurfPoints % nGen;

// Use generator centers: never exactly phi1 or phi2
const Double_t dphi = dPhiTot / (Double_t)nGen;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to cast to double

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I'm a bit maniac due to the work on mixed precision in VecGeom. Harmless here.

continue;

const Double_t phi = phi1 + (ig + 0.5) * dphi; // avoid endpoints
const Double_t c = TMath::Cos(phi);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sincos

Copy link
Member Author

@agheata agheata Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that sincos is a non-standard extension that exists on most platforms, but not on all (such as mac/windows flavors), so IMHO TMath should (it does not) provide a safe wrapper for it that falls back on regilar Sin/Cos where not available). Or maybe Iam wrong?


const Double_t phi = phi1 + (ig + 0.5) * dphi; // avoid endpoints
const Double_t c = TMath::Cos(phi);
const Double_t s = TMath::Sin(phi);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sincos

#include "TMap.h"
#include "TFile.h"
#include "TKey.h"
#include <TString.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stl and root's include order

void TGeoVolume::SetFWExtension(TGeoExtension *ext)
{
TGeoExtension* tmp = fFWExtension;
TGeoExtension *tmp = fFWExtension;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use of this line?

#define ROOT_TGeoChecker

#include "TVirtualGeoChecker.h"
#include "TGeoOverlapCandidate.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stl - ROOT includes order

UInt_t id;
Int_t level;
// Check extrusion only for daughters of a non-assembly volume
Int_t ncand = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please solve these warnings

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did I miss some warnings?

@@ -1,6 +1,7 @@
#include <gtest/gtest.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this test enough? Maybe there are more cases to check and that would be very useful!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, we can have a more extended check, but the thing with overlap checking is that the results may change if for some reason, the points generated on the shape meshes change. Overlap checking only strives not to give false positives, but may miss. So this case is exactly doing this: knows that the shapes mother/daughter have some common faces, but this is correct, so no extrusion should be reported. Actually this unit test triggered the error on Windows, that was a real bug, and also other errors when changing some global static and having 2 geometry managers in the same process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants