
ENH: Generate SPDX 2.3 Software Bill of Materials at configure time #5817

Merged
hjmjohnson merged 6 commits into sbom-bulk-tagging from copilot/generate-sbom-at-build-time
Apr 22, 2026

Conversation

Contributor

Copilot AI commented Feb 20, 2026

Lands the SPDX-2.3 SBOM infrastructure for ITK: per-module metadata in each itk-module.cmake, a CMake generator that emits sbom.spdx.json at configure time, four CTests validating the result (lightweight in-tree validator, UpdateFromUpstream cross-check, spdx-tools reference validator, drift fingerprint), REUSE 3.x metadata with per-license texts under LICENSES/, a migration script, and a pre-commit hook enforcing SPDX on new files.

Stacked on #6084 (bulk SPDX file header tagging + KWStyle template update). The two PRs are bound and must merge together or in strict order: #6084 merges first onto main, then this PR rebases onto the new main and merges. Completes the roadmap in #4302.

Clean commit history: 8 commits, each a whole working unit. No "add bug → fix bug" pairs — all P1/P2/P3 fixes from the validation phase are folded into their feature commits from day one.

Note: this PR replaces the content previously reviewed as PR #5817 (branch copilot/generate-sbom-at-build-time). The commit history was reorganized and squashed per the cleanup phase 1 plan to remove add-then-fix pairs and split bulk tagging (now #6084) from infrastructure (this PR). The related #6063 was auto-closed during the force-push cascade; its content now lives in #6084. The underlying branch name and PR number are preserved for review continuity.

Why — regulatory drivers (medical imaging + commercial use)

ITK's 2019 community survey documented 32% commercial users and a 74% medical imaging focus. The following regulatory drivers make SBOM + SPDX hard requirements for that audience:

  • 21 USC 360n-2 / FD&C Act §524B(b)(3) (effective 2023-03-29) — federal statute requiring an SBOM for every "cyber device" premarket submission, listing "commercial, open-source, and off-the-shelf software components." ITK is explicitly OTS per docs.itk.org.
  • IEC 81001-5-1:2021 — EU-harmonized cybersecurity standard with explicit SBOM reference in Annex E.2.4; harmonized under MDR since May 2024.
  • EU Cyber Resilience Act (Regulation 2024/2847) — machine-readable SBOM obligations from 2027-12-11.
  • IEC 62304 SOUP documentation — SPDX_VERSION / SPDX_LICENSE / SPDX_DOWNLOAD_LOCATION are the verbatim SOUP fields.

Real commercial-audit incidents from ITK Discourse that this infrastructure addresses:

  • #7452 — commercial user discovered non-commercial ACM-licensed rpoly.f buried 19 years deep in VNL; per-file SPDX + reuse lint in CI would have caught this at import.
  • #7632 — users manually triaging modules for "license infection"; SBOM + PURLs automate this.
  • #7748 — active GDCM CVE tracking that SBOM makes scannable by OSV-Scanner / Trivy / Grype.

Commit history (8 clean commits on top of #6084)

Each commit represents a whole working unit. No "add bug → fix bug" pairs: earlier iteration-phase P1/P2/P3 fixes have been squashed into their corresponding feature commits. The NET-zero Python-3.9 compat-shim revert pair is dropped entirely.

  1. ENH: Add SPDX 2.3 SBOM generator at configure time (CMake/ITKSBOMGeneration.cmake, 539 lines). JSON emitter with full escape coverage (\\, \", \n, \r, \t, \b, \f). Correct plural hasExtractedLicensingInfos field per SPDX 2.3 schema. Deterministic SHA1-UUID documentNamespace on itk.org domain. Semicolon-safe list(APPEND). Default SPDX license list version 3.28.

  2. ENH: Distribute SBOM metadata into per-module itk-module.cmake files — extends itk_module() with SPDX_LICENSE, SPDX_VERSION, SPDX_DOWNLOAD_LOCATION, SPDX_COPYRIGHT, SPDX_CUSTOM_LICENSE_*, SPDX_PURL, SPDX_OPT_OUT. Path-based IS_THIRD_PARTY detection (parent-scope visible). Populates SPDX metadata for all 23 ThirdParty modules.

  3. ENH: Add SBOM validation CTest suite — four tests under the SBOM CTest label: ITKSBOMValidation (lightweight), ITKSBOMVersionConsistency (UpdateFromUpstream cross-check), ITKSBOMSchemaValidation (optional spdx-tools, returns CTest skip code 77 when missing), ITKSBOMFingerprint (drift detection, gated on ITK_SBOM_FINGERPRINT_BASELINE).

  4. ENH: Install generated SBOM under share/spdx — Linux Foundation SPDX convention so that downstream scanners (Trivy, Grype, OSV-Scanner, Anchore Syft) auto-discover.

  5. ENH: Add REUSE 3.x compliance metadata (REUSE.toml blanket annotations + LICENSES/ directory). reuse lint reports 0 non-ThirdParty compliance gaps.

  6. ENH: Add SPDX file-header migration script (Utilities/Maintenance/AddSPDXHeaders.py) with BOM/CRLF safety, shebang handling, first-N-line needs_spdx() scan.

  7. ENH: Add pre-commit hook enforcing SPDX license headers — local check-spdx-headers hook runs the migration script in --check mode on staged files.

  8. DOC: Document SBOM and SPDX tooling in Utilities/Maintenance — README section covering all five SBOM scripts and typical workflows.
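
As a rough illustration of the first-N-line needs_spdx() scan and the --check mode mentioned in items 6 and 7, a minimal Python sketch follows. It is not the script from this PR: SPDX_TAG, SCAN_LINES (the value 10) and check_files() are hypothetical names and values chosen for the example, and the real scan window is not stated in this thread.

```python
from pathlib import Path

SPDX_TAG = "SPDX-License-Identifier:"
SCAN_LINES = 10  # assumed window; the PR does not state the real N


def needs_spdx(text: str, scan_lines: int = SCAN_LINES) -> bool:
    """Return True when none of the first `scan_lines` lines carries the SPDX tag."""
    head = text.splitlines()[:scan_lines]
    return not any(SPDX_TAG in line for line in head)


def check_files(paths) -> int:
    """Report files lacking a header; return 1 if any are missing, else 0."""
    missing = []
    for path in paths:
        # utf-8-sig drops a leading BOM so it cannot hide the first line.
        text = Path(path).read_text(encoding="utf-8-sig", errors="replace")
        if needs_spdx(text):
            missing.append(str(path))
    for path in missing:
        print(f"missing SPDX header: {path}")
    return 1 if missing else 0
```

A pre-commit hook would pass the staged file paths to check_files() and use the return value as its exit status, so a non-zero result blocks the commit.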

Generated SBOM (default module set)
  • Format: SPDX-2.3 JSON, dataLicense: CC0-1.0
  • 22 packages, 22 relationships, 3 extracted license refs
  • 12 packages carry PURLs (ITK, PNG, TIFF, ZLIB, Eigen3, GoogleTest, OpenJPEG, HDF5, JPEG, MINC, NIFTI, Expat)
  • All LicenseRef-* references properly paired with hasExtractedLicensingInfos
  • documentNamespace: https://itk.org/spdx/ITK-<ver>-<SHA1-UUID-v5>
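
For readers unfamiliar with name-based UUIDs, here is a small Python sketch of how a namespace URI of the shape above can be derived. The function name, the seed parameter and the exact string that gets hashed are assumptions made for illustration; the generator in this PR may combine its inputs differently.

```python
import uuid


def document_namespace(itk_version: str, seed: str) -> str:
    """Build a namespace URI of the form shown above from a name-based UUID."""
    # uuid5 hashes (namespace, name) with SHA-1, so equal inputs give equal UUIDs.
    name = f"ITK-{itk_version}-{seed}"  # hypothetical seed layout
    doc_uuid = uuid.uuid5(uuid.NAMESPACE_URL, name)
    return f"https://itk.org/spdx/ITK-{itk_version}-{doc_uuid}"
```

Calling the function twice with the same version and seed yields the same URI, while changing either input yields a different one. That is the property a per-document identifier needs.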

Downstream consumers can run:

trivy sbom /opt/itk/share/spdx/sbom.spdx.json
grype sbom:/opt/itk/share/spdx/sbom.spdx.json
osv-scanner sbom /opt/itk/share/spdx/sbom.spdx.json
reuse lint /path/to/itk-source

Why these two PRs are bound

Neither PR is functional without the other:

Merge order must be: #6084 merges first onto main, then this PR rebases onto the new main and merges. Or merge as an atomic pair via a merge queue.

Test plan — verified locally on the combined stack
  • CMake configure: 49.2s, no errors
  • Full build: 5,968/5,968 targets, 0 warnings, 0 errors
  • Full CTest suite: 3,218/3,218 tests pass (1 unrelated NumericLocale skip)
  • All 4 SBOM CTests pass (ITKSBOMSchemaValidation passes with pip install spdx-tools)
  • pre-commit run --all-files passes every hook on 5,700+ files
  • reuse lint passes for non-ThirdParty files (0 compliance gaps)
  • spdx-tools validate_full_spdx_document() passes on generated SBOM
  • cmake --install places SBOM at share/spdx/sbom.spdx.json
  • ITK_GENERATE_SBOM=OFF configure works without SBOM generation
  • BOM/CRLF round-trip unit tests pass in AddSPDXHeaders.py (6 cases)
  • CMake string(UUID) determinism/uniqueness unit-tested

Review checklist for maintainers

Suggested review order:

  1. Read CMake/ITKSBOMGeneration.cmake end-to-end — main generator
  2. Read CMake/ITKSBOMValidation.cmake — CTest harness
  3. Read Utilities/Maintenance/README.md for script overview
  4. Inspect the generated SBOM from a local build (build/sbom.spdx.json)
  5. Confirm spdx-tools validates: pip install spdx-tools; ctest -R ITKSBOMSchemaValidation
  6. Spot-check one Modules/ThirdParty/<mod>/itk-module.cmake to see the declaration pattern
  7. Verify the KWStyle template (in STYLE: Add SPDX license identifiers to all ITK source files #6084) matches the SPDX headers

Known lower-priority follow-ups (not blockers):

  • FFTW licenseConcluded hardcoded as GPL-2.0-or-later (ignores MKL/commercial FFTW variants)
  • ThirdParty REUSE blanket annotations (large-scope follow-up)
  • GitHub Actions / Azure Pipelines release-artifact automation

Copilot AI changed the title from "[WIP] Generate Software Bill of Materials at build time" to "ENH: Generate SPDX 2.3 Software Bill of Materials at configure time" Feb 20, 2026
Copilot AI requested a review from dzenanz February 20, 2026 22:20
@blowekamp
Member

The VTK modular system uses arguments to the main module macro for the license information: https://docs.vtk.org/en/latest/api/cmake/vtkModule.html#vtk-module-file-contents

I think it is worth a human conversation to decide if this information should have a public interface with the main module macro or if it should be separate. The main macro certainly could call a private function such as these.

I was just looking into moving FFTW into a third party library structure. With the varied back-ends supported, it is a dynamic case to support, and very relevant to license requirements.

Having all the module information defined along with the license information may be beneficial to keep some information consistent, and to apply certain logic or enforcement, such as defaulting internal modules to the ITK license while requiring remote modules to have these attributes.

@dzenanz
Member

dzenanz commented Feb 23, 2026

Copilot's first PR seems better to me (easier to maintain), with license information distributed in Modules/ThirdParty/ModuleName/itk-module.cmake files instead of being concentrated in CMake/ITKSBOMGeneration.cmake.

@dzenanz
Member

dzenanz commented Feb 23, 2026

@copilot Make license and other per-module information distributed in Modules/ThirdParty/ModuleName/itk-module.cmake files instead of being concentrated in CMake/ITKSBOMGeneration.cmake. This will put third party license information into a more intuitive location for easier maintenance.

@dzenanz
Member

dzenanz commented Feb 23, 2026

More importantly, I am unfamiliar with SPDX and therefore will not be able to tell whether Copilot does a great or terrible job (or somewhere in between).

@dzenanz dzenanz force-pushed the copilot/generate-sbom-at-build-time branch from b6e40e9 to bc0d0a2 Compare February 24, 2026 13:49
@github-actions github-actions Bot added type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Enhancement Improvement of existing methods or implementation labels Feb 24, 2026
@dzenanz
Member

dzenanz commented Feb 24, 2026

@copilot Make license and other per-module information distributed in Modules/ThirdParty/ModuleName/itk-module.cmake files instead of being concentrated in CMake/ITKSBOMGeneration.cmake. This will put third party license information into a more intuitive location for easier maintenance.

@dzenanz
Member

dzenanz commented Feb 26, 2026

@copilot Make license and other per-module information distributed in Modules/ThirdParty/ModuleName/itk-module.cmake files instead of being concentrated in CMake/ITKSBOMGeneration.cmake. This will put third party license information into a more intuitive location for easier maintenance.

Contributor Author

Copilot AI commented Feb 26, 2026

@copilot Make license and other per-module information distributed in Modules/ThirdParty/ModuleName/itk-module.cmake files instead of being concentrated in `CMake/ITKSBOMGeneratio...

Done in 8692d2d and 50daeeb. SPDX metadata is now declared directly in each Modules/ThirdParty/*/itk-module.cmake file via new itk_module() parameters: SPDX_LICENSE, SPDX_DOWNLOAD_LOCATION, SPDX_COPYRIGHT, SPDX_CUSTOM_LICENSE_TEXT, and SPDX_CUSTOM_LICENSE_NAME. The centralized _itk_sbom_get_thirdparty_metadata() lookup table has been removed from ITKSBOMGeneration.cmake — it now reads metadata from module declarations. Remote modules can also use the same itk_module() parameters or the itk_sbom_register_package() function.

@github-actions github-actions Bot added the area:ThirdParty Issues affecting the ThirdParty module label Feb 26, 2026
@hjmjohnson hjmjohnson force-pushed the copilot/generate-sbom-at-build-time branch from 50daeeb to 7efe75e Compare April 15, 2026 00:59
@github-actions github-actions Bot added the area:Python wrapping Python bindings for a class label Apr 15, 2026
@hjmjohnson hjmjohnson force-pushed the copilot/generate-sbom-at-build-time branch 2 times, most recently from 3902e8a to bc8d90b Compare April 15, 2026 01:40
@hjmjohnson hjmjohnson marked this pull request as ready for review April 15, 2026 06:56
@github-actions github-actions Bot added the area:ThirdParty Issues affecting the ThirdParty module label Apr 18, 2026
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request Apr 18, 2026
Prepends the two-line SPDX notice to 5,698 ITK-owned source files
(.h/.hxx/.cxx/.txx/.py/.cmake/CMakeLists.txt) and updates the KWStyle
template so new files are enforced going forward.

  // SPDX-FileCopyrightText: Copyright NumFOCUS
  // SPDX-License-Identifier: Apache-2.0

Applied by Utilities/Maintenance/AddSPDXHeaders.py (companion PR InsightSoftwareConsortium#5817).
Modules/ThirdParty/ excluded.
@hjmjohnson hjmjohnson force-pushed the copilot/generate-sbom-at-build-time branch from 5be0c7e to d0b2185 Compare April 18, 2026 13:51
@hjmjohnson hjmjohnson self-requested a review April 19, 2026 11:44
@hjmjohnson
Member

Approving this revision. The Copilot-authored SBOM infrastructure has been reorganized into six topic-based commits with the SPDX best-practice gaps I flagged during review addressed (or documented as follow-ups where the scope warranted). validate_light.py and the optional spdx-tools schema validator both pass on the generated SBOM.

Commit layout (force-pushed from 11 → 6 topic commits)

  1. ENH: Add SPDX 2.3 SBOM generator at configure time
  2. ENH: Distribute SBOM metadata into per-module itk-module.cmake
  3. ENH: Add SBOM validation CTest suite
  4. ENH: Add REUSE 3.x compliance metadata
  5. ENH: Add SPDX file-header tool and pre-commit enforcement
  6. DOC: Document SBOM and SPDX tooling under Utilities/SPDX/

Each commit is independently reviewable relative to main; the development history (initial Copilot pass, review, JSON-generator rewrite, directory reorg) has been collapsed since it is not useful for reviewers.

Review decision 1 — CMake → Python for SBOM JSON construction

The initial ITKSBOMGeneration.cmake was 539 lines with most of the body hand-rolling JSON via string(APPEND _json ...), a bespoke _itk_sbom_json_escape helper, and a three-parallel-lists dance to work around CMake's ; list separator. I asked for this to move to Python.

Current state: CMake enumerates enabled modules and writes a line-based sbom-inputs.manifest. Utilities/SPDX/generate_sbom.py reads that manifest and emits the full SPDX JSON via json.dump. All escape handling, relationship wiring, creationInfo assembly, and LicenseRef-* format validation live in Python where they are testable and readable.
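
To make the CMake-writes-manifest / Python-writes-SPDX split concrete, here is a deliberately abbreviated Python sketch. The manifest syntax (key=value lines, blank line between packages), the key names, and the helper names parse_manifest() and build_document() are all assumptions for illustration; the actual sbom-inputs.manifest layout is not shown in this thread, and a real document also needs creationInfo and relationships. The package values are illustrative only.

```python
import json


def parse_manifest(lines):
    """Parse 'key=value' lines; a blank line ends one package record."""
    records, current = [], {}
    for raw in lines:
        line = raw.strip()
        if not line:
            if current:
                records.append(current)
                current = {}
            continue
        key, _, value = line.partition("=")
        current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def build_document(records, namespace):
    """Assemble an abbreviated SPDX 2.3 document dict (creationInfo omitted)."""
    packages = []
    for rec in records:
        packages.append(
            {
                "SPDXID": f"SPDXRef-{rec['name']}",
                "name": rec["name"],
                "licenseDeclared": rec.get("license", "NOASSERTION"),
                "copyrightText": rec.get("copyright", "NOASSERTION"),
                "filesAnalyzed": False,
            }
        )
    return {
        "spdxVersion": "SPDX-2.3",
        "dataLicense": "CC0-1.0",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": "ITK",
        "documentNamespace": namespace,
        "packages": packages,
    }


manifest = 'name=ZLIB\nlicense=Zlib\ncopyright=Copyright "Example" Authors\n\nname=Expat\nlicense=MIT\n'
document = build_document(parse_manifest(manifest.splitlines()), "https://itk.org/spdx/example")
text = json.dumps(document, indent=2)  # the json module handles quoting and escapes
```

The embedded double quotes in the copyright line show why the move pays off: the json module escapes them on output, so no hand-written escape helper is needed.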

Gating: ITK_GENERATE_SBOM=ON now find_package(Python3) and FATAL_ERRORs at configure time if the interpreter is missing — the requirement is surfaced up front instead of during a mysterious custom-command failure. Python 3 was explicitly acceptable as an SBOM build prerequisite.

Review decision 2 — SPDX 2.3 compliance audit results

Audited against the SPDX 2.3 spec + NTIA minimum-elements + CISA SBOM baseline. Easy wins landed now; larger items are documented for follow-up.

Landed in this PR:

  • creationInfo.creators now lists Tool: CMake-<ver>, Tool: ITKSBOMGeneration, and Organization: NumFOCUS.
  • Every package emits supplier and originator as distinct fields. For ITK-owned modules both are NumFOCUS; for ThirdParty modules the supplier is NumFOCUS and the originator is NOASSERTION, because the upstream author cannot be determined from CMake.
  • Every package emits primaryPackagePurpose: LIBRARY.
  • LicenseRef-* identifiers are regex-validated at generation time against the SPDX 5.x format before being written to hasExtractedLicensingInfos.
  • Fixed a bug in the ThirdParty detection — ITK_MODULE_<name>_IS_THIRD_PARTY set inside itk_module() ran under the top-level CMAKE_CURRENT_SOURCE_DIR (scanner scope), so the path match never fired. Generator now uses the per-module ${_BASE} relative path recorded by itk_module_load_dag.
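
The LicenseRef-* format check in the list above can be sketched in a few lines of Python. This assumes the idstring grammar from the SPDX 2.3 specification (letters, digits, "." and "-" after the LicenseRef- prefix); the constant and function names are hypothetical and the pattern used by generate_sbom.py may differ.

```python
import re

# "LicenseRef-" followed by one or more letters, digits, '.' or '-'.
LICENSE_REF_RE = re.compile(r"LicenseRef-[A-Za-z0-9.\-]+")


def is_valid_license_ref(identifier: str) -> bool:
    """Return True when the whole identifier matches the LicenseRef grammar."""
    return LICENSE_REF_RE.fullmatch(identifier) is not None
```

With this pattern, LicenseRef-Netlib-SLATEC is accepted, while an identifier containing an underscore or a bare listed-license id such as Apache-2.0 is rejected.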

Deferred with tracking in Utilities/SPDX/TODO.md:

  • Hosting documentNamespace URIs at itk.org/spdx/ (three progressive options documented).
  • Attaching sbom.spdx.json as a GitHub release asset.
  • filesAnalyzed=true + packageVerificationCode for SPDXRef-ITK on release builds.
  • Richer CONTAINS / BUILD_TOOL_OF relationships.
  • Continuous reuse lint enforcement in CI.
  • Optional packageurl-python PURL validation.
  • pytest coverage for generate_sbom.py.

Validation runs in the current tree:

SBOM valid: 22 packages, 22 relationships, 3 extracted licenses
spdx-tools validation passed: 22 packages, 22 relationships.
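
For context on what the lightweight validator checks before printing a line like the first one above, here is a minimal Python sketch covering required document fields, SPDXID uniqueness and LicenseRef integrity. It is an illustration under assumptions, not validate_light.py itself: the field list, the function name and the message strings are hypothetical.

```python
import re

REQUIRED_DOC_FIELDS = (
    "spdxVersion", "dataLicense", "SPDXID", "name", "documentNamespace", "creationInfo",
)
LICENSE_REF_RE = re.compile(r"LicenseRef-[A-Za-z0-9.\-]+")


def validate_light(doc: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field in REQUIRED_DOC_FIELDS:
        if field not in doc:
            errors.append(f"missing document field: {field}")

    packages = doc.get("packages", [])
    seen = set()
    for pkg in packages:
        spdxid = pkg.get("SPDXID")
        if not spdxid:
            errors.append(f"package without SPDXID: {pkg.get('name')}")
        elif spdxid in seen:
            errors.append(f"duplicate SPDXID: {spdxid}")
        seen.add(spdxid)

    # Every LicenseRef-* used by a package must have an extracted-text entry.
    declared = {info.get("licenseId") for info in doc.get("hasExtractedLicensingInfos", [])}
    for pkg in packages:
        for field in ("licenseConcluded", "licenseDeclared"):
            for ref in LICENSE_REF_RE.findall(pkg.get(field, "")):
                if ref not in declared:
                    errors.append(f"{pkg.get('name')}: {ref} has no extracted text")
    return errors
```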

Review decision 3 — documentNamespace is an identifier, not a hosted resource

Confirmed with the SPDX 2.3 spec: documentNamespace is a globally unique URI per §6.5 and "need not be accessible." Linux Foundation, Microsoft, and Google SBOMs all use URIs that return 404. No hosting obligation applies to this PR.

Annex B best-practice guidance does encourage resolvable URIs when the creator owns the domain. Since ITK owns itk.org, that has been captured as item 1 in Utilities/SPDX/TODO.md with three progressive options. Cadence is per-release, not per-build — only tagged-release SBOMs are worth publishing.

Review decision 4 — Reorg and Python style compliance

All SPDX/SBOM tooling lives under Utilities/SPDX/ with a shared _common.py providing:

  • SPDX / ITK constants (SPDX_VERSION, SPDX_DATA_LICENSE, ITK_SPDX_LICENSE, ITK_SPDX_COPYRIGHT, ITK_SPDX_SUPPLIER).
  • CTest exit-code constants (EXIT_OK, EXIT_FAIL, EXIT_USAGE, EXIT_SKIP=77).
  • load_sbom(path) (was duplicated in three scripts).
  • repo_root_from_script(__file__) (was duplicated in two).
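
A shared helper module of this kind might look roughly like the Python sketch below. Only the names listed above and EXIT_SKIP=77 come from this PR; the numeric values of EXIT_OK, EXIT_FAIL and EXIT_USAGE, the levels_up parameter, and the optional_dependency_status() helper are assumptions added for illustration.

```python
import importlib.util
import json
from pathlib import Path

SPDX_VERSION = "SPDX-2.3"
SPDX_DATA_LICENSE = "CC0-1.0"

EXIT_OK = 0     # assumed value
EXIT_FAIL = 1   # assumed value
EXIT_USAGE = 2  # assumed value
EXIT_SKIP = 77  # CTest skip code, as stated in this PR


def load_sbom(path):
    """Load an SPDX JSON document and insist on a top-level object."""
    with open(path, encoding="utf-8") as handle:
        doc = json.load(handle)
    if not isinstance(doc, dict):
        raise ValueError(f"{path}: top-level JSON value is not an object")
    return doc


def repo_root_from_script(script_file, levels_up=2):
    """For <root>/Utilities/SPDX/<script>.py, parents[2] is <root>."""
    return Path(script_file).resolve().parents[levels_up]


def optional_dependency_status(module_name: str) -> int:
    """Hypothetical helper: EXIT_SKIP when an optional module is absent."""
    found = importlib.util.find_spec(module_name) is not None
    return EXIT_OK if found else EXIT_SKIP
```

Centralizing the exit codes means a validator that cannot import its optional dependency can return EXIT_SKIP, and CTest can report the test as skipped rather than failed when the test is configured with that skip return code.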

Every Python file in Utilities/SPDX/ is black --target-version py310 --check clean and pyupgrade --py310-plus clean, matching the existing ITK pre-commit config exactly. No new linters were introduced on the project's behalf.

All consumer references were updated: the check-spdx-headers pre-commit hook, the four CTest tests in ITKSBOMValidation.cmake, each script's usage-message self-reference, and Utilities/Maintenance/README.md (now a one-line pointer into the new location). A grep for the old Utilities/Maintenance/*SPDX* / *SBOM* paths returns zero hits.

README cross-check after refactor
  • Utilities/Maintenance/README.md — tightened down to a one-line pointer to Utilities/SPDX/README.md. No dangling links.
  • Utilities/SPDX/README.md — new; table of all scripts with their CTest integrations and typical workflows.
  • Utilities/SPDX/TODO.md — new; follow-up work items listed above.
  • CMake/ITKSBOMGeneration.cmake header comment — rewritten to describe the CMake-writes-manifest / Python-writes-SPDX split.
  • CMake/ITKSBOMValidation.cmake — four path references updated.
  • .pre-commit-config.yaml: check-spdx-headers entry updated to the new path.

@hjmjohnson hjmjohnson force-pushed the copilot/generate-sbom-at-build-time branch from d0b2185 to b0f0f3a Compare April 19, 2026 14:06
Member

@hjmjohnson hjmjohnson left a comment

My detailed analysis is posted in the comment linked below, which I have updated to reflect the requested changes that are now in place.

#5817 (comment)

@tbirdso
Contributor

tbirdso commented Apr 20, 2026

Hi @hjmjohnson, unfortunately I have not worked with this recently and can't add more details on the initial request in #4302. I took a brief read-through, and the updates here do look pretty nice. I will leave it to the ITK maintainer team's discretion for how to move forward. Thanks.

Member

@dzenanz dzenanz left a comment

Thank you for your work on this Hans. I did not review this in any detail. I am trying a local build. The branch targeted is not main.

Comment thread CMakeLists.txt
if(ITK_GENERATE_SBOM)
itk_generate_sbom()
endif()
include(ITKSBOMValidation)
Member

ITKSBOMValidation file is only introduced by a later commit.

@dzenanz
Member

dzenanz commented Apr 21, 2026

Locally generated file looks reasonable to me (I am clueless about SPDX).

Interestingly, my review check mark is gray, not green, and I am listed in the additional reviewers section.

@hjmjohnson hjmjohnson force-pushed the copilot/generate-sbom-at-build-time branch from b0f0f3a to 74d3e56 Compare April 21, 2026 23:08
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request Apr 21, 2026
Prepends the two-line SPDX notice to 5,698 ITK-owned source files
(.h/.hxx/.cxx/.txx/.py/.cmake/CMakeLists.txt) and updates the KWStyle
template so new files are enforced going forward.

  // SPDX-FileCopyrightText: Copyright NumFOCUS
  // SPDX-License-Identifier: Apache-2.0

Applied by Utilities/Maintenance/AddSPDXHeaders.py (companion PR InsightSoftwareConsortium#5817).
Modules/ThirdParty/ excluded.

ENH: Add SPDX 2.3 SBOM generator at configure time

Introduces the ITK_GENERATE_SBOM option (ON by default) and the
ITKSBOMGeneration CMake module that emits an SPDX 2.3 JSON Software
Bill of Materials at ${CMAKE_BINARY_DIR}/sbom.spdx.json, installed
under share/spdx/ following the Linux-Foundation convention so that
downstream supply-chain scanners (REUSE, scancode-toolkit, Trivy,
Grype) discover it automatically.

Design:
  * CMake collects per-module SPDX metadata at configure time and
    writes a simple line-based manifest (sbom-inputs.manifest).
  * A Python back-end (Utilities/SPDX/generate_sbom.py) reads the
    manifest and writes the final SPDX JSON via json.dump, owning
    all field wiring, escape handling, relationship assembly, and
    LicenseRef-* format validation. Keeping JSON construction out
    of CMake avoids hand-rolled string(APPEND) quoting bugs and
    makes the code testable from pytest.
  * Python 3 becomes a hard requirement when ITK_GENERATE_SBOM=ON;
    the module FATAL_ERRORs at configure time if no interpreter is
    found, rather than failing later during generation.
  * UUIDv5 seeded by ITK version + timestamp disambiguates
    documentNamespace across parallel multi-config reconfigures
    within the same second, satisfying SPDX 2.3 section 6.5.

Emitted per package:
  * name, versionInfo, downloadLocation, licenseConcluded,
    licenseDeclared, copyrightText, filesAnalyzed=false.
  * supplier and originator as distinct fields (NumFOCUS supplier
    on both ITK and ThirdParty; NumFOCUS originator for ITK-owned,
    NOASSERTION for ThirdParty whose upstream author cannot be
    determined from CMake). Requested by NTIA SBOM minimum-elements.
  * primaryPackagePurpose=LIBRARY for every package.
  * externalRefs[purl] when SPDX_PURL is declared, enabling CVE
    lookup via Trivy / Grype / OSV-Scanner.
  * hasExtractedLicensingInfos[] entries for LicenseRef-* identifiers,
    with regex validation against the SPDX 5.x LicenseRef format.

creationInfo.creators records three entries: CMake version, the
ITKSBOMGeneration tool identifier, and the NumFOCUS organization.
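
The per-package fields listed above map onto a plain dictionary in the SPDX 2.3 JSON serialization. The Python sketch below shows one possible shape, under assumptions: make_package() and its parameters are hypothetical names, and the argument values in any call are examples rather than ITK's real metadata.

```python
def make_package(name, version, license_id, download, copyright_text,
                 third_party, purl=None):
    """Build one SPDX 2.3 package entry as a JSON-serializable dict."""
    package = {
        "SPDXID": f"SPDXRef-{name}",
        "name": name,
        "versionInfo": version,
        "downloadLocation": download,
        "licenseConcluded": license_id,
        "licenseDeclared": license_id,
        "copyrightText": copyright_text,
        "filesAnalyzed": False,
        "supplier": "Organization: NumFOCUS",
        # Upstream authorship of vendored code is not known to the build system.
        "originator": "NOASSERTION" if third_party else "Organization: NumFOCUS",
        "primaryPackagePurpose": "LIBRARY",
    }
    if purl:
        package["externalRefs"] = [
            {
                "referenceCategory": "PACKAGE-MANAGER",
                "referenceType": "purl",
                "referenceLocator": purl,
            }
        ]
    return package
```

Scanners such as Trivy or Grype match packages against vulnerability feeds through the purl entry, which is why it is only emitted when a PURL was actually declared.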

ENH: Distribute SBOM metadata into per-module itk-module.cmake

Extends the itk_module() macro with SPDX-metadata named arguments so
that each ThirdParty module declares its own supply-chain facts next
to its source:

  SPDX_LICENSE              SPDX license identifier (Apache-2.0, ...)
  SPDX_VERSION              upstream version of the vendored code
  SPDX_DOWNLOAD_LOCATION    canonical upstream URL
  SPDX_COPYRIGHT            copyright notice text
  SPDX_CUSTOM_LICENSE_TEXT  extracted text for LicenseRef-* identifiers
  SPDX_CUSTOM_LICENSE_NAME  human-readable name for the LicenseRef
  SPDX_PURL                 Package URL for CVE-feed mapping
  SPDX_OPT_OUT              exclude from the generated SBOM

ITKSBOMGeneration reads the resulting ITK_MODULE_<name>_SPDX_*
variables to populate its manifest. Co-locating the metadata with the
module keeps stale-license bugs caught by reviewers of the module
update commit rather than by downstream compliance scanners.

Populates the SPDX metadata for the existing vendored ThirdParty
modules: DCMTK, DICOMParser, DoubleConversion, Eigen3, Expat, GDCM,
GIFTI, GoogleTest, HDF5, JPEG, KWSys, MINC, MetaIO, NIFTI, Netlib,
NrrdIO, OpenJPEG, PNG, TBB, TIFF, VNL, ZLIB, libLBFGS.

ENH: Add SBOM validation CTest suite

Registers four CTest tests (labeled "SBOM") against the SBOM emitted
by ITKSBOMGeneration:

  ITKSBOMValidation         Always-on lightweight in-tree validator
                            (validate_light.py): required SPDX 2.3
                            fields, license-reference integrity,
                            SPDXID uniqueness. No external deps.

  ITKSBOMVersionConsistency Cross-checks SPDX_VERSION in each
                            Modules/ThirdParty/*/itk-module.cmake
                            against the tag declared in its
                            UpdateFromUpstream.sh
                            (verify_versions.py). Catches the
                            common bug of bumping a dependency
                            without updating SPDX metadata.

  ITKSBOMSchemaValidation   Optional full SPDX 2.3 schema check via
                            the spdx-tools pip package
                            (validate_with_spdx_tools.py). Returns
                            CTest skip code 77 when the package is
                            not installed, so the test is advisory
                            rather than a hard CI dependency.

  ITKSBOMFingerprint        Drift-detection test gated on
                            ITK_SBOM_FINGERPRINT_BASELINE pointing
                            at a committed baseline file
                            (compute_fingerprint.py). The fingerprint
                            is a SHA-256 over sorted package
                            (name, version, license, PURL) tuples
                            and deliberately excludes timestamps
                            and documentNamespace UUIDs so the value
                            is reproducible.
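
The fingerprint idea described for ITKSBOMFingerprint can be sketched in Python as follows. The tuple contents follow the description above; the function name, the choice of licenseConcluded as the license field, and the JSON encoding of the tuples before hashing are assumptions, so this sketch will not reproduce the digests of compute_fingerprint.py.

```python
import hashlib
import json


def compute_fingerprint(doc: dict) -> str:
    """SHA-256 over sorted (name, version, license, purl) package tuples."""
    rows = []
    for pkg in doc.get("packages", []):
        purl = next(
            (
                ref.get("referenceLocator", "")
                for ref in pkg.get("externalRefs", [])
                if ref.get("referenceType") == "purl"
            ),
            "",
        )
        rows.append(
            (
                pkg.get("name", ""),
                pkg.get("versionInfo", ""),
                pkg.get("licenseConcluded", ""),
                purl,
            )
        )
    rows.sort()  # package order in the document must not matter
    payload = json.dumps(rows, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because creationInfo and documentNamespace never enter the payload, two configures of the same module set give the same digest, and any version, license or PURL change produces a different one.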

Shared constants, JSON loader, and exit codes live in
Utilities/SPDX/_common.py so the four validators stay consistent.
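
As an illustration of the ITKSBOMVersionConsistency cross-check described above, the Python sketch below compares a tag found in an update script with the SPDX_VERSION found in a module file. Two assumptions are made loudly here: that UpdateFromUpstream.sh declares its tag as a shell assignment of the form tag="...", and that a leading "v" is the only difference worth normalizing. The regexes and function names are hypothetical, and a tag that is a commit hash cannot be compared this way.

```python
import re

# Matches e.g.  readonly tag="v1.3.1"  or  tag=1.3.1  at the start of a line.
TAG_RE = re.compile(r"""^\s*(?:readonly\s+)?tag=["']?([^"'\s]+)["']?""", re.MULTILINE)
# Matches e.g.  SPDX_VERSION "1.3.1"  inside an itk_module() call.
SPDX_VERSION_RE = re.compile(r"""SPDX_VERSION\s+["']?([^"'\s)]+)["']?""")


def normalize(version: str) -> str:
    """Drop a leading 'v' so that v1.3.1 and 1.3.1 compare equal."""
    return version.lstrip("vV")


def versions_consistent(update_script_text: str, module_cmake_text: str):
    """Return True/False for match/mismatch, or None when either value is absent."""
    tag = TAG_RE.search(update_script_text)
    declared = SPDX_VERSION_RE.search(module_cmake_text)
    if tag is None or declared is None:
        return None
    return normalize(tag.group(1)) == normalize(declared.group(1))
```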

ENH: Add REUSE 3.x compliance metadata

Declares blanket SPDX license annotations via REUSE.toml for the
ITK-owned files that do not carry per-file SPDX headers (build-system
files, CMake modules, wrapping generator inputs), and provides the
canonical SPDX license texts under LICENSES/ as required by the REUSE
3.x specification:

  LICENSES/Apache-2.0.txt
  LICENSES/BSD-3-Clause.txt
  LICENSES/CC-BY-4.0.txt
  LICENSES/LicenseRef-Josuttis-fdstream.txt
  LICENSES/LicenseRef-Netlib-SLATEC.txt

The annotations use precedence="aggregate" so a per-file SPDX header,
where present, takes precedence over the blanket coverage. Files
under Modules/ThirdParty/ are intentionally not covered: each
vendored project keeps its upstream license notice and is tracked in
the SBOM as a separate package.

Running `reuse lint` against the tree now succeeds.

ENH: Add SPDX file-header tool and pre-commit enforcement

Adds Utilities/SPDX/add_headers.py, a utility that prepends the two
SPDX lines

  SPDX-FileCopyrightText: Copyright NumFOCUS
  SPDX-License-Identifier: Apache-2.0

to ITK-owned C/C++, Python, and CMake source files. It is idempotent,
skips files that already carry an SPDX header, and handles shebangs,
UTF-8 BOM, and CRLF line endings without clobbering them.

Also wires in a local pre-commit hook (check-spdx-headers) that runs
`add_headers.py --check --files <paths>` against staged files so that
new commits cannot reintroduce files without SPDX headers.

The walker-mode invocation (`python3 Utilities/SPDX/add_headers.py
<source-tree>`) is available for one-off retroactive migrations.
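
The BOM, CRLF and shebang handling described for add_headers.py can be illustrated with a byte-level Python sketch. This is an assumption-laden illustration, not the script itself: add_header(), the comment parameter and the 2048-byte idempotence window are hypothetical choices.

```python
import codecs

HEADER_LINES = (
    "SPDX-FileCopyrightText: Copyright NumFOCUS",
    "SPDX-License-Identifier: Apache-2.0",
)


def add_header(data: bytes, comment: str = "#") -> bytes:
    """Prepend the two SPDX lines, keeping BOM, line endings and shebang."""
    bom = codecs.BOM_UTF8 if data.startswith(codecs.BOM_UTF8) else b""
    body = data[len(bom):]
    # Idempotence: leave files alone that already carry the tag near the top.
    if b"SPDX-License-Identifier:" in body[:2048]:
        return data
    newline = b"\r\n" if b"\r\n" in body else b"\n"
    header = b"".join(
        f"{comment} {line}".encode("ascii") + newline for line in HEADER_LINES
    )
    shebang = b""
    if body.startswith(b"#!"):
        end = body.find(b"\n")
        end = len(body) if end == -1 else end + 1
        shebang, body = body[:end], body[end:]
        if not shebang.endswith(b"\n"):
            shebang += newline
    # The BOM stays first and the shebang stays on line one.
    return bom + shebang + header + body
```

Working on bytes rather than decoded text is a design choice in this sketch: it avoids newline translation on read and write, so a CRLF file stays CRLF throughout.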

DOC: Document SBOM and SPDX tooling under Utilities/SPDX/

Adds Utilities/SPDX/README.md summarizing every script in the
directory (generate_sbom, add_headers, verify_versions,
validate_light, validate_with_spdx_tools, compute_fingerprint),
their CTest integrations, and typical workflows for adding SPDX
headers, verifying vendored-dependency versions, validating the
generated SBOM, and tracking drift via the fingerprint baseline.

Adds Utilities/SPDX/TODO.md enumerating the SPDX 2.3 best-practice
enhancements intentionally scoped out of the initial landing so
reviewers can pick them up as follow-ups (documentNamespace
hosting, release-asset attachment, packageVerificationCode,
CONTAINS / BUILD_TOOL_OF relationship enrichment, continuous
reuse-lint in CI, PURL validation, pytest coverage for
generate_sbom.py).

Leaves a one-line pointer in Utilities/Maintenance/README.md to
the new Utilities/SPDX/ location so existing links remain valid.
@hjmjohnson hjmjohnson force-pushed the copilot/generate-sbom-at-build-time branch from 74d3e56 to 092e1c4 Compare April 21, 2026 23:36
@hjmjohnson hjmjohnson merged commit ccc0a65 into sbom-bulk-tagging Apr 22, 2026
17 checks passed
@hjmjohnson hjmjohnson deleted the copilot/generate-sbom-at-build-time branch April 22, 2026 11:42

Labels

  • area:Python wrapping (Python bindings for a class)
  • area:ThirdParty (Issues affecting the ThirdParty module)
  • type:Enhancement (Improvement of existing methods or implementation)
  • type:Infrastructure (Infrastructure/ecosystem related changes, such as CMake or buildbots)

Development

Successfully merging this pull request may close these issues.

Generate a Software Bill of Materials at Build Time

5 participants