Skip to content

ENH: Resample Image Geometry uses much less memory at the cost of some computation time#1529

Merged
imikejackson merged 5 commits intoBlueQuartzSoftware:developfrom
imikejackson:topic/resample_image_geometry
Feb 21, 2026
Merged

ENH: Resample Image Geometry uses much less memory at the cost of some computation time#1529
imikejackson merged 5 commits intoBlueQuartzSoftware:developfrom
imikejackson:topic/resample_image_geometry

Conversation

@imikejackson
Copy link
Contributor

The code was updated to remove the pre-calculated voxel indices which added an 8 byte integer at every voxel which can quickly blow up memory usage. The code now uses some extra cpu cycles to compute the source and destination indices.

@imikejackson imikejackson linked an issue Feb 13, 2026 that may be closed by this pull request
@imikejackson imikejackson force-pushed the topic/resample_image_geometry branch 2 times, most recently from ab3332f to 1b22dfd Compare February 17, 2026 21:39
imikejackson and others added 4 commits February 20, 2026 17:07
Uses ProgressMessageHelper with shared atomic counter to report
percentage complete for each DataArray during the resampling loop.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
We cannot parallelize effectively over each array without allocating 8 byte indices
or other large buffers so go back to at copying each Data Array on its own thread.
@imikejackson imikejackson force-pushed the topic/resample_image_geometry branch from 1b22dfd to 8737444 Compare February 20, 2026 22:07
@imikejackson
Copy link
Contributor Author

PR #1529 Review: "Resample Image Geometry uses much less memory at the cost of some computation time"

Author: imikejackson | Files changed: 2 (ResampleImageGeom.cpp, ResampleImageGeom.hpp) | +81 / -62 lines

What it does

Rewrites the ResampleImageGeom algorithm to eliminate a pre-computed std::vector<int64> that mapped every destination voxel to its source voxel index. For a 1000x1000x1000 geometry, this vector alone consumed ~8 GB of RAM. The new approach dynamically computes source-to-destination mapping per voxel using ImageGeom::getPlaneCoords() and ImageGeom::getIndex().

Old flow: One parallel pass to pre-compute all indices into a vector, then N parallel copy passes (one per data array) using CopyTupleUsingIndexList.

New flow: N parallel tasks (one per data array), each doing its own per-voxel coordinate lookup + copy via ResampleImageGeomArrayImpl.

Positive observations

  • Significant memory reduction - eliminates destDims[0]*destDims[1]*destDims[2] * 8 bytes allocation
  • Better origin handling - old code assumed same origin for source/destination (raw x / origSpacing); new code correctly accounts for different origins via getPlaneCoords (adds origin) and getIndex (subtracts origin)
  • Graceful out-of-bounds handling - when getIndex returns std::nullopt, the destination tuple is zeroed out via fillTuple(destVoxelIdx, 0), which is a reasonable default
  • Array-level parallelism via ParallelTaskAlgorithm mitigates the extra computation on multi-core systems
  • Thread-safe progress reporting with mutex-guarded ThrottledMessenger

Issues found

1. Misleading comment (minor)

Line ~58 in the new code:

// Get the destination voxel center.
Point3D<float64> coords = m_DestImageGeom.getPlaneCoords(destVoxelIdx);

getPlaneCoords returns the voxel corner (min corner / node position): column * spacing + origin. The actual center method is getCoordsf / getCoords, which adds + 0.5 * spacing. The behavior is consistent with the old code (which also used corners), but the comment is inaccurate.

2. Unchecked copyFrom return value (medium)

destDataStore.copyFrom(destVoxelIdx, srcDataStore, srcIndex.value(), 1);

AbstractDataStore::copyFrom returns Result<> which can contain errors (out-of-range index, component mismatch). The return value is silently discarded. While the indices should always be valid here given the geometry constraints, silently ignoring errors could mask bugs during future refactoring.

3. Unused member variable m_InitialPoint (minor)

std::chrono::steady_clock::time_point m_InitialPoint = std::chrono::steady_clock::now();

This member is declared and initialized in ResampleImageGeom.hpp but never referenced anywhere. Dead code.

4. Raw pointer to stack-local ThrottledMessenger (design note)

ThrottledMessenger throttledMessenger = messageHelper.createThrottledMessenger();
m_ThrottledMessengerPtr = &throttledMessenger;

This stores a raw pointer to a local variable in operator(). It's safe today because taskRunner.wait() ensures all tasks complete before operator() returns, but it's fragile -- any future refactoring that changes the lifetime of throttledMessenger could introduce a dangling pointer. Consider making it a std::unique_ptr member or documenting the lifetime dependency.

5. Typo in comment (trivial)

taskRunner.wait(); // This will spill over if the number of geometries to processes does not divide evenly by the number of threads.

"processes" should be "process"

Performance trade-off consideration

The old approach did 1 parallel pass to pre-compute indices, then N copy passes that were simple array-index lookups. The new approach does N passes, each with per-voxel getPlaneCoords (modulo + division + multiply) and getIndex (subtraction + division + floor + bounds checks). For a geometry with many cell data arrays, total coordinate computation work scales as numArrays * numVoxels instead of 1 * numVoxels. The array-level parallelism helps, but on a system with fewer cores than arrays, this is a real slowdown. The commit message acknowledges this trade-off, which is good -- the memory savings are well worth it for large geometries.

Summary

The PR accomplishes its stated goal of reducing memory consumption. The core logic is correct and the behavior is consistent with the old code. The issues found are all minor to medium severity -- the unchecked copyFrom return and the misleading "center" comment are the most actionable items.

@imikejackson imikejackson enabled auto-merge (squash) February 20, 2026 22:29
@imikejackson imikejackson merged commit 9057f4a into BlueQuartzSoftware:develop Feb 21, 2026
6 checks passed
@imikejackson imikejackson deleted the topic/resample_image_geometry branch February 23, 2026 01:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Resample Data (Image Geometry) Takes massive amounts of memory

2 participants