ENH: Ingest AnisotropicDiffusionLBR into Modules/Filtering#6093

Merged
hjmjohnson merged 94 commits into InsightSoftwareConsortium:main from hjmjohnson:ingest-AnisotropicDiffusionLBR
Apr 23, 2026

@hjmjohnson hjmjohnson commented Apr 20, 2026

Moves AnisotropicDiffusionLBR from a configure-time remote fetch into the ITK source tree at Modules/Filtering/AnisotropicDiffusionLBR/ using the v3 whitelist ingestion strategy. Scope is deliberately narrow: only code, headers, tests, wrapping, and module build descriptors cross the merge boundary. Everything else (Old/, examples/, paper/, docs, packaging/CI scaffolding) stays in the archived upstream repo. Relates to #6060.

Strategy: v3 whitelist + CID normalization

Evolved across two feedback rounds in this PR:

  • @blowekamp (2026-04-21): full-history merges risk git-object bloat; Old/, paper material, demo PNGs should not enter ITK.
  • @dzenanz (2026-04-21): full history where clean, squash where messy; migrate all hash-based content-links to .cid; examples belong in top-level Examples/, not inline with module source.

The v1 approach (full-history merge of everything upstream shipped) was abandoned. The v2 bloat-thresholded approach was superseded by the v3 structural whitelist: a git filter-repo --path pass that admits only the whitelisted categories. Strategy documents live at .claude/worktrees/pr-6061-local-beta-modules-support/ (INGESTION_STRATEGY.md, AUDIT_DESIGN.md, CLEANUP_CHECKLIST.md).

What ingested (whitelist)

The git filter-repo pipeline that produced the merge commit:

git filter-repo \
  --path include --path src --path test --path wrapping \
  --path CMakeLists.txt --path itk-module.cmake \
  --to-subdirectory-filter Modules/Filtering/AnisotropicDiffusionLBR \
  --prune-empty always
git filter-repo --invert-paths --path '.../CTestConfig.cmake'  # second pass

Result: 168 upstream commits → 106 surviving on the filtered branch; 43 files transferred; 15 distinct authors preserved; git blame walks across the merge boundary into upstream authors.
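
As a rough illustration of how the first whitelist pass above could be assembled programmatically, here is a minimal Python sketch. The function name `build_filter_repo_args` is hypothetical and is not taken from the ingest driver; only the flags shown in the pipeline above are used, and the sketch prints the command rather than running it.

```python
import shlex

# Whitelisted top-level paths, copied from the pipeline above.
WHITELIST = ["include", "src", "test", "wrapping", "CMakeLists.txt", "itk-module.cmake"]


def build_filter_repo_args(whitelist, destination):
    """Return the argv list for the first (whitelist) filter-repo pass.

    Hypothetical helper for illustration; it only assembles arguments.
    """
    args = ["git", "filter-repo"]
    for path in whitelist:
        # One --path flag per whitelisted directory or file.
        args += ["--path", path]
    args += ["--to-subdirectory-filter", destination, "--prune-empty", "always"]
    return args


first_pass = build_filter_repo_args(
    WHITELIST, "Modules/Filtering/AnisotropicDiffusionLBR"
)
# Print a copy-pasteable shell command instead of executing it.
print(shlex.join(first_pass))
```

Keeping the whitelist as data makes it easy to review any widening of the admitted paths in a diff.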

What did NOT ingest (see Modules/Filtering/AnisotropicDiffusionLBR/README.md)
  • Old/ — pre-refactor legacy trees
  • examples/ + examples/Data/ — per @dzenanz, per-module examples are routed to top-level Examples/ in a separate follow-up PR, not ingested inline
  • README.rst — useful content is folded into Doxygen on the filter headers
  • CTestConfig.cmake, pyproject.toml, .github/, .clang-format, LICENSE — standalone-repo scaffolding superseded by ITK root equivalents

The new in-tree Modules/Filtering/AnisotropicDiffusionLBR/README.md documents each exclusion with a pointer at the archived upstream URL.

Commit structure (4 commits)
  1. ENH: Ingest ITKAnisotropicDiffusionLBR into Modules/Filtering — merge commit of the whitelist-filtered upstream
  2. DOC: Add README.md pointing at archived upstream for AnisotropicDiffusionLBR — pointer doc + exclusion rationale
  3. COMP: Remove AnisotropicDiffusionLBR.remote.cmake; now in-tree — delete the remote fetch declaration
  4. ENH: Enable AnisotropicDiffusionLBR in CI via configure-ci — opt the module into the Pixi CI matrices
⚠ Pre-merge step: CID normalization pending

Per the v3 strategy, the 24 test-data content-links under test/Baseline/ and test/Input/ should be normalized from .md5 to .cid before this PR merges. The conversion is mechanical — resolve each MD5 hash to its stored bytes via the ExternalData mirror, compute the CIDv1, write <file>.cid, delete <file>.md5.

This step was not executed in the ingest session because the session lacked network access to data.kitware.com and web3.storage. Either:

  • A maintainer with network access runs a new Utilities/Maintenance/cid-normalize.sh helper (to be added in a follow-up; Utilities/SPDX/ is the wrong tool for this) that invokes @web3-storage/w3cli per Documentation/docs/contributing/upload_binary_data.md;
  • OR we accept the .md5 content-links as-is for this PR and open a follow-up sweep PR that converts .md5 → .cid across the whole tree (not just this module), matching ITK's own precedent at commit f3899ce8c6.

The tree-wide sweep is probably the less disruptive path — reviewers of this ingest PR can focus on the structural ingest without a CID-conversion diff on top.
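
The mechanical conversion described in this section (resolve the MD5 to its bytes, compute the CID, write <file>.cid, delete <file>.md5) might look roughly like the Python sketch below. This is an assumption-laden outline, not the planned helper: `normalize_link`, `fetch_bytes`, and `compute_cid` are hypothetical names, and the mirror lookup and CID computation are injected as callables rather than implemented, since both need network access or external tooling.

```python
import hashlib
import tempfile
from pathlib import Path


def normalize_link(md5_link, fetch_bytes, compute_cid):
    """Replace <file>.md5 with <file>.cid and return the new path.

    fetch_bytes(md5_hex) -> bytes and compute_cid(data) -> str are
    hypothetical callables supplied by the caller.
    """
    md5_link = Path(md5_link)
    expected = md5_link.read_text().strip()
    data = fetch_bytes(expected)
    # Refuse to write a .cid for bytes that do not match the recorded hash.
    if hashlib.md5(data).hexdigest() != expected:
        raise ValueError(f"MD5 mismatch for {md5_link}")
    cid_link = md5_link.with_suffix(".cid")
    cid_link.write_text(compute_cid(data) + "\n")
    md5_link.unlink()
    return cid_link


# Offline demonstration with stand-in callables and a placeholder CID string.
with tempfile.TemporaryDirectory() as tmp:
    payload = b"example test image bytes"
    link = Path(tmp) / "Baseline.png.md5"
    link.write_text(hashlib.md5(payload).hexdigest() + "\n")
    result = normalize_link(link, lambda _md5: payload, lambda _data: "placeholder-cid")
    demo_name = result.name
    demo_md5_gone = not link.exists()
    print(demo_name, result.read_text().strip())
```

Verifying the fetched bytes against the recorded MD5 before deleting the old link keeps the conversion safe to rerun if a mirror returns the wrong object.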

AI disclosure

Ingestion pipeline (filter-repo pass order, README scope, commit structure) developed with Claude Code assistance. Every commit is locally rebuildable and testable; git log shows the 15 upstream authors preserved across the whitelist-filtered merge.

thewtex added 30 commits May 27, 2015 21:33
Initial import from http://hdl.handle.net/10380/3505
http://insight-journal.org/browse/publication/953

Add missing LICENSE file.

Test data is uploaded to midas3.kitware.com and content links are added
instead.
This is a requirement to add the file to ITK as a Remote Module.
Also fix trailing whitespace, indentation, some comments.
This causes issues when building outside the ITK source tree.
This is not found in Windows.  itk::TimeProbe is a cross-platform replacement.
 This was used for debug code, so that was removed.
Uses std::cout instead of std::cerr.
This is for:

  c:\jenkins\workspace\itkgerritwindows\itk-src\modules\remote\anisotropicdiffusionlbr\include\itkLinearAnisotropicDiffusionLBRImageFilter.h(110)
  : warning C4348:
  'itk::LinearAnisotropicDiffusionLBRImageFilter::GetDiffusion' : redefinition
  of default parameter : parameter 1
This is for:

  c:\jenkins\workspace\itkgerritwindows\itk-src\modules\remote\anisotropicdiffusionlbr\include\itkStructureTensorImageFilter.h(95)
  : warning C4348:
  'itk::StructureTensorImageFilter,itk::Image>::IntermediateFilter' :
  redefinition of default parameter : parameter 1

on MSVC.
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request Apr 22, 2026
Introduces ingest-remote-module.sh, a driver script that moves an ITK
remote module (Modules/Remote/<Name>.remote.cmake) into the main
source tree at Modules/<Group>/<Name>/ while preserving authorship
and keeping ITK's git pack small.

The script implements the v3 whitelist strategy agreed on PR InsightSoftwareConsortium#6093:

  * filter-repo --paths restricts history to include/, src/, test/,
    wrapping/, CMakeLists.txt, itk-module.cmake -- everything else
    (Old/, examples/, docs/, paper/, .github/, pyproject.toml,
    CTestConfig.cmake, LICENSE, .clang-format, ...) stays in the
    archived upstream repo.
  * --to-subdirectory-filter rewrites paths under the destination.
  * --prune-empty always drops commits whose changes are entirely
    outside the whitelist.
  * A second pass strips CTestConfig.cmake specifically (points at
    a standalone CDash project that does not apply in-tree).
  * The resulting merge is --allow-unrelated-histories --no-ff; the
    commit message carries the upstream URL + tip SHA plus
    Co-authored-by: trailers for every upstream contributor derived
    from the filter-repo'd git log.

README.md gives the human operator the quick-start recipe (five
commits per module, one PR per module) and a --dry-run walkthrough
for previewing an ingest before committing.

Intended follow-ups handled by the caller, not by this script:

  * DOC: the in-tree README pointing at the archived upstream.
  * COMP: deletion of Modules/Remote/<Name>.remote.cmake.
  * ENH: -DModule_<Name>:BOOL=ON in pyproject.toml configure-ci.
  * STYLE: optional .md5/.shaNNN -> .cid content-link conversion
    (may be deferred to a tree-wide sweep PR following the
    f3899ce precedent).
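
The Co-authored-by: trailer derivation mentioned in the commit message above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `attribution` is hypothetical, the input is assumed to be one "Name <email>" string per commit (for example the lines printed by `git log --format='%an <%ae>'`), and treating the most frequent identity as the primary author is a guess about the script's behavior, not something the commit message states.

```python
from collections import Counter


def attribution(identities):
    """Split commit author identities into a primary author and trailers.

    identities: one "Name <email>" string per commit in the filtered history.
    Assumption: the identity with the most commits is the primary author.
    """
    counts = Counter(identities)
    primary, _ = counts.most_common(1)[0]
    # Every other distinct identity becomes a trailer, sorted for stable output.
    trailers = [f"Co-authored-by: {who}" for who in sorted(counts) if who != primary]
    return primary, trailers


# Made-up identities for demonstration only.
sample = [
    "Alice Example <alice@example.org>",
    "Bob Example <bob@example.org>",
    "Alice Example <alice@example.org>",
    "Carol Example <carol@example.org>",
]
primary, trailers = attribution(sample)
print("Primary author:", primary)
print("\n".join(trailers))
```

Because identities are compared as exact strings, one person with several name or email spellings yields several trailers, which is consistent with the repeated names in the merge commit's trailer list.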
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request Apr 22, 2026
Four long-form documents landing alongside ingest-remote-module.sh:

  INGESTION_STRATEGY.md
    Policy document.  Whitelist definition, mode selection (full /
    filtered / squash), attribution floor, CID-normalization pipeline,
    examples-relocation policy.  Codifies the PR InsightSoftwareConsortium#6093 consensus that
    commit count is NOT a gate -- only size metrics are -- so modules
    with hundreds of genuine upstream commits can land in full-history
    mode as long as pack-delta and blob-size thresholds are met.

  AUDIT_DESIGN.md
    Design notes for the pre-ingest audit pass: blob-size histogram,
    strip-candidate path detection (paths present in pre-tip history
    but absent in tip), copyright-review flag for PDFs / videos /
    large images, recommend_mode() pseudocode.

  CLEANUP_CHECKLIST.md
    What the post-merge STYLE commit checks (now a safety-net since
    the whitelist handles the common case at graft time).  Still used
    for copyright review and for Mode B residual-blob stripping.

  AGENTS.md
    Guidance for AI coding agents running this workflow.  Pre-flight
    gates, decision points (non-Apache license, raw binary test assets,
    non-whitelisted paths the module needs, CID-normalization gap,
    examples/ routing), escalation triggers for handing back to the
    human.  Explicit "don't do these things" section covers common
    pitfalls (re-squash-silently, widen-whitelist-without-documenting,
    force-push-ingest-PR).

These documents were developed and iterated on across PR InsightSoftwareConsortium#6061,
InsightSoftwareConsortium#6085, InsightSoftwareConsortium#6086, and especially InsightSoftwareConsortium#6093 (the thread that produced the v3
whitelist + CID-normalization approach).  Landing them in-tree under
Utilities/Maintenance/RemoteModuleIngest/ makes future changes to the
strategy reviewable via standard PR process rather than through PR
comment updates on long-running threads.
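
The strip-candidate detection that AUDIT_DESIGN.md is described as covering (paths present in pre-tip history but absent in tip) reduces to a set difference. A minimal Python sketch follows, assuming the two path lists have already been collected, for instance with `git log --name-only --format=` for history and `git ls-tree -r --name-only HEAD` for the tip. The function name `strip_candidates` and the sample paths are hypothetical.

```python
def strip_candidates(history_paths, tip_paths):
    """Return paths seen somewhere in history that no longer exist at the tip."""
    return sorted(set(history_paths) - set(tip_paths))


# Made-up path lists for demonstration only.
history = [
    "include/itkFoo.h",
    "Old/legacy.cxx",
    "paper/figure.png",
    "test/CMakeLists.txt",
]
tip = ["include/itkFoo.h", "test/CMakeLists.txt"]

candidates = strip_candidates(history, tip)
print(candidates)
```

Paths flagged this way still occupy space in the pack even though they are invisible in a checkout, which is why an audit would look at history and not only at the tip.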
@hjmjohnson hjmjohnson force-pushed the ingest-AnisotropicDiffusionLBR branch from e15148a to 5a3f8ee Compare April 22, 2026 16:45
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request Apr 22, 2026
Per @dzenanz nitpick on PR InsightSoftwareConsortium#6093: keep -DITK_COMPUTER_MEMORY_SIZE (the
line carrying the closing ''') at the end of the cmd variable so that
future -DModule_X:BOOL=ON additions are +1-line diffs without moving
the closing-quote position.
@hjmjohnson

Force-pushed the re-ingest with the tightened v3 pipeline. The original inline comment SHAs became stale on the force-push, so summarizing the fixes here:

Fixes landed on the new 5-commit series

05b9a9b57f COMP: Fix upstream bugs in ingested AnisotropicDiffusionLBR

  • itk-module.cmake — replaced file(READ README.rst DOCUMENTATION) with an inline set(DOCUMENTATION ...) per ITK in-tree convention. README.rst is (correctly) excluded by the v3 whitelist, so the old form failed at configure time (the CI error @dzenanz flagged).
  • itk-module.cmake — dropped ITKIOSpatialObjects and ITKMetaIO from DEPENDS per Greptile P2. Neither is referenced by any public header or test driver in the ingested tree.
  • itkStructureTensorImageFilter.h: using Superclass = ImageToImageFilter<TImage, TImage>; was wrong, because the class inherits from ImageToImageFilter<TImage, TTensorImage>, where TTensorImage defaults to Image<SymmetricSecondRankTensor<...>>. Fixed to match the actual base per Greptile P1. This is a pre-existing upstream bug worth fixing now that the code is in-tree (ITK style conformance is always in scope under the v3 strategy).

5a3f8ee413 ENH: Enable AnisotropicDiffusionLBR in CI via configure-ci

Per @dzenanz nitpick: -DITK_COMPUTER_MEMORY_SIZE:STRING=11 stays at the end of the cmd variable (keeping the closing ''' attached to it), and the new -DModule_AnisotropicDiffusionLBR:BOOL=ON goes on the line before. Future -DModule_X:BOOL=ON additions are now +1-line diffs with no quote-movement.

Whitelist history verification — v3 flaw caught and fixed in #6098

The previous v3 ingest had two scaffolding paths leak through the directory-level whitelist: test/azure-pipelines.yml (6 historical commits) and test/Docker/Dockerfile + test/Docker/*.sh (8 historical commits each). These lived inside the whitelisted test/ directory and were only caught by a history-wide audit.

Fix landed in #6098: the ingest-remote-module.sh driver now runs a second filter-repo pass that strips scaffolding basenames (Dockerfile, azure-pipelines*.yml, Jenkinsfile, .github, .circleci, [Dd]ocker/, and about 20 others) from any path in the module tree, followed by a verify-whitelist-history.sh scan that aborts the ingest if any scaffolding pattern is still reachable in any commit of the filtered history.

Re-ingest ran with the tightened pipeline and passes verification: 70 distinct paths across the entire ingested history, all within include/ src/ test/ wrapping/ CMakeLists.txt itk-module.cmake, zero scaffolding leaks.
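
To make the verification step concrete, here is a hedged Python sketch of a whitelist-and-scaffolding check over a list of history paths. It is not the verify-whitelist-history.sh implementation: the function name `violations` is hypothetical, the scaffolding list is only the subset of patterns named in this comment, and the input is assumed to be every path reachable in the filtered history.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

MODULE_ROOT = "Modules/Filtering/AnisotropicDiffusionLBR"
WHITELIST = {"include", "src", "test", "wrapping", "CMakeLists.txt", "itk-module.cmake"}
# Subset of the scaffolding basenames named above; the real list is longer.
SCAFFOLDING = [
    "Dockerfile",
    "azure-pipelines*.yml",
    "Jenkinsfile",
    ".github",
    ".circleci",
    "[Dd]ocker",
]


def violations(paths):
    """Return paths outside the whitelist or with a scaffolding component."""
    bad = []
    for path in paths:
        try:
            parts = PurePosixPath(path).relative_to(MODULE_ROOT).parts
        except ValueError:
            # Not under the module tree at all.
            bad.append(path)
            continue
        outside = not parts or parts[0] not in WHITELIST
        # Match every path component so nested scaffolding is caught too.
        scaffold = any(fnmatchcase(part, pat) for part in parts for pat in SCAFFOLDING)
        if outside or scaffold:
            bad.append(path)
    return bad


sample_paths = [
    f"{MODULE_ROOT}/include/itkStructureTensorImageFilter.h",
    f"{MODULE_ROOT}/test/azure-pipelines.yml",
    f"{MODULE_ROOT}/test/Docker/Dockerfile",
    f"{MODULE_ROOT}/README.rst",
]
leaks = violations(sample_paths)
print(leaks)
```

Matching on every path component, not just the top-level directory, is what catches the test/azure-pipelines.yml and test/Docker/ leaks that a directory-level whitelist alone let through.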

Current branch (5 commits)
5a3f8ee413  ENH:  Enable AnisotropicDiffusionLBR in CI via configure-ci
612d8409dd  COMP: Remove AnisotropicDiffusionLBR.remote.cmake; now in-tree
05b9a9b57f  COMP: Fix upstream bugs in ingested AnisotropicDiffusionLBR
7170913f4b  DOC:  Add README.md pointing at archived upstream for AnisotropicDiffusionLBR
d022eb0c23  ENH:  Ingest ITKAnisotropicDiffusionLBR into Modules/Filtering   [merge]
─────────── (upstream/main 95d3db31e4)

CI will re-run on the new tip. CID normalization still pending (24 .md5 stubs → .cid) — must complete before merge per #6098 rules.

@hjmjohnson hjmjohnson force-pushed the ingest-AnisotropicDiffusionLBR branch from 25b17d7 to 5a46182 Compare April 22, 2026 18:10
@hjmjohnson

@greptileai review this draft before I make it official.

The previous review was against an older tip; the branch has since been force-pushed through multiple rounds (v3 pipeline re-ingest, gersemi STYLE, KWStyle private: fix in itkLinearAnisotropicDiffusionLBRImageFilter.hxx). Current tip: 5a46182206. All CI platforms now green except the accepted ghostflow-check-main.

Comment thread Modules/Filtering/AnisotropicDiffusionLBR/CMakeLists.txt Outdated
@hjmjohnson hjmjohnson force-pushed the ingest-AnisotropicDiffusionLBR branch from 5a46182 to 86c1629 Compare April 22, 2026 21:51
hjmjohnson and others added 6 commits April 22, 2026 22:23
Brings AnisotropicDiffusionLBR from a configure-time remote fetch into the ITK
source tree at Modules/Filtering/AnisotropicDiffusionLBR/ using the v3 whitelist
filter-repo pipeline.

Upstream repo: https://github.com/InsightSoftwareConsortium/ITKAnisotropicDiffusionLBR.git
Upstream tip:  203260b929acda68a1f64b1267b0d89e825904ec
Ingest date:   2026-04-22

Whitelist passes (git filter-repo):
  - --path include --path src --path test --path wrapping
  - --path CMakeLists.txt --path itk-module.cmake
  - --to-subdirectory-filter Modules/Filtering/AnisotropicDiffusionLBR
  - --prune-empty always
  - (if present) second pass: invert CTestConfig.cmake

Outcome: 168 upstream commits -> 88 surviving;
15 distinct authors preserved; git blame walks across the
merge boundary to original authors.

Content-link inventory: .md5=24  .shaNNN=0  .cid=0
TODO before merge: convert 24 non-.cid content-link(s) to .cid.

Primary author: Matt McCormick <matt.mccormick@kitware.com>

Co-authored-by: Bradley Lowekamp <blowekamp@mail.nih.gov>
Co-authored-by: Dženan Zukić <dzenan.zukic@kitware.com>
Co-authored-by: Francois Budin <francois.budin@gmail.com>
Co-authored-by: Hans J. Johnson <hans-johnson@uiowa.edu>
Co-authored-by: Hans Johnson <hans-johnson@uiowa.edu>
Co-authored-by: Hans Johnson <hans.j.johnson@gmail.com>
Co-authored-by: Jon Haitz Legarreta <jhlegarreta@vicomtech.org>
Co-authored-by: Jon Haitz Legarreta Gorroño <jhlegarreta@vicomtech.org>
Co-authored-by: Jon Haitz Legarreta Gorroño <jon.haitz.legarreta@gmail.com>
Co-authored-by: Mathew J. Seng <mathewseng@gmail.com>
Co-authored-by: Mathew Seng <mathewseng@gmail.com>
Co-authored-by: Matt McCormick <matt@mmmccormick.com>
Co-authored-by: Matthew McCormick <matt@mmmccormick.com>
Co-authored-by: Tom Birdsong <tom.birdsong@kitware.com>
…sionLBR

Provides in-tree pointer to the upstream ITKAnisotropicDiffusionLBR repo
and documents what was deliberately left outside the whitelist (Old/,
examples/, test/azure-pipelines.yml, test/Docker/, README.rst, packaging
scaffolding) and why.
Two defects noticed during the ingest review; both are pre-existing
in the upstream tip.  Fixing them here keeps subsequent commits
in this series able to configure + build + test.

  * itk-module.cmake called file(READ README.rst) at configure time,
    but README.rst was stripped by the v3 whitelist (not an ITK
    module-tree file).  Replace with an inline set(DOCUMENTATION ...)
    per ITK in-tree convention; drop the redundant ITKIOSpatialObjects
    and ITKMetaIO entries from DEPENDS (greptile P2 — no in-source
    references).

  * itkStructureTensorImageFilter used
      using Superclass = ImageToImageFilter<TImage, TImage>;
    but the class inherits from
      ImageToImageFilter<TImage, TTensorImage>.
    TTensorImage defaults to Image<SymmetricSecondRankTensor<...>,...>,
    not TImage, so the alias is wrong.  Any downstream code resolving
    Superclass::OutputImageType would get the wrong type.  Fixed
    (greptile P1).
Per @dzenanz nitpick on PR InsightSoftwareConsortium#6093: keep -DITK_COMPUTER_MEMORY_SIZE (the
line carrying the closing ''') at the end of the cmd variable so that
future -DModule_X:BOOL=ON additions are +1-line diffs without moving
the closing-quote position.
Upstream ITKAnisotropicDiffusionLBR's CMakeLists.txt files predate the
ITK gersemi pre-commit hook.  Reformat the three ingested files to
satisfy .gersemi.config (0.19.3): collapse single-arg set(), expand
itk_add_test() arg lists, lowercase command names.

No functional change.
@hjmjohnson hjmjohnson force-pushed the ingest-AnisotropicDiffusionLBR branch from 86c1629 to 0fe0ab3 Compare April 22, 2026 22:25
@hjmjohnson hjmjohnson merged commit ceedf98 into InsightSoftwareConsortium:main Apr 23, 2026
16 of 18 checks passed
@hjmjohnson hjmjohnson deleted the ingest-AnisotropicDiffusionLBR branch April 23, 2026 02:29
Labels

  • area:Filtering (Issues affecting the Filtering module)
  • area:Python wrapping (Python bindings for a class)
  • area:Remotes (Issues affecting the Remote module)
  • type:Data (Changes to testing data)
  • type:Enhancement (Improvement of existing methods or implementation)
  • type:Infrastructure (Infrastructure/ecosystem related changes, such as CMake or buildbots)
  • type:Testing (Ensure that the purpose of a class is met/the results on a wide set of test cases are correct)
