CI: build relocatable packages via TheRock SDK #262
Open
Adds a CI workflow that builds DEB/RPM/TGZ packages of TransferBench against the TheRock nightly ROCm SDK, modeled on the ROCmValidationSuite packaging workflow. Packages install to /opt/rocm/extras-<MAJOR> with $ORIGIN-relative RPATH so they are relocatable.
- build_packages_local.sh: single source of truth for both local and CI builds. Detects Ubuntu vs AlmaLinux/manylinux, installs deps, fetches TheRock SDK tarball, configures CMake with relocatable RPATH and the new BUILD_RELOCATABLE_PACKAGE option, builds, and invokes CPack for DEB/RPM/TGZ.
- .github/workflows/build-relocatable-packages.yml: parallel Ubuntu 22.04 + manylinux_2_28 jobs triggered on push, PR, daily cron, and workflow_dispatch. OIDC-based S3 upload gated on AWS_S3_BUCKET being set; apt/yum repo metadata generated for non-PR builds. Build report artifact summarizes S3 paths.
- .github/workflows/README_BUILD_PACKAGES.md: workflow docs covering triggers, local usage, S3 layout, IAM trust policy, and apt/yum install snippets.
- CMakeLists.txt: new BUILD_RELOCATABLE_PACKAGE option that bypasses rocm_install/rocm_create_package, names the package amdrocm<MAJOR>-transferbench, and honors caller-set install prefix and CPACK_*_PACKAGE_RELEASE env vars. Default cmake .. behavior is unchanged.
Co-Authored-By: Claude Opus 4 <[email protected]>
The tarballs are published as therock-dist-linux-<family>-<version>.tar.gz, not <family>-<version>.tar.gz as I had it. TheRock also does not publish a per-family LATEST.txt, so the auto-fetch path now lists the bucket via S3 ListObjectsV2 and picks the highest version key with `sort -V`. Updates the pinned fallback to a version that actually exists on the bucket today (7.13.0a20260423).
Co-Authored-By: Claude Opus 4 <[email protected]>
The bucket includes non-release ad-hoc builds (e.g. therock-dist-linux-gfx94X-dcgpu-ADHOCBUILD-7.0.0rc20250625.tar.gz) which `sort -V | tail -1` was selecting because 'A' lexically sorts after digits. The downstream `printf '%02d'` then crashed trying to parse `ADHOCBUILD-7` as the ROCm major. Restrict the auto-fetch grep to keys matching <prefix><MAJOR>.<MINOR>.<patch+suffix>.tar.gz so only properly versioned releases are considered.
Co-Authored-By: Claude Opus 4 <[email protected]>
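The selection rule described in the two commit messages above can be sketched in shell as follows. This is a minimal illustration, not the code in build_packages_local.sh: the function names `pick_latest_key` and `rocm_major_of` are hypothetical, the key list is passed on stdin rather than fetched via ListObjectsV2, and the family name is assumed to contain no regex metacharacters.

```bash
# Hypothetical helper: read candidate object keys on stdin (one per line) and
# print the highest properly versioned tarball key for one GPU family.
pick_latest_key() {
  local family="$1"
  local prefix="therock-dist-linux-${family}-"
  # Keep only <prefix><MAJOR>.<MINOR>.<patch+suffix>.tar.gz; this drops ad-hoc
  # keys such as <prefix>ADHOCBUILD-7.0.0rc20250625.tar.gz before sorting.
  # `|| true` keeps an empty match from failing the pipeline under `set -e`.
  { grep -E "^${prefix}[0-9]+\.[0-9]+\.[0-9]+[A-Za-z0-9]*\.tar\.gz$" || true; } \
    | sort -V | tail -n 1
}

# Hypothetical helper: extract the ROCm major version from a selected key.
rocm_major_of() {
  local key="$1" family="$2"
  local version="${key#"therock-dist-linux-${family}-"}"  # strip the prefix
  version="${version%.tar.gz}"                             # strip the extension
  printf '%s\n' "${version%%.*}"                           # text before first dot
}

# Usage with the two key names quoted in the commit messages:
printf '%s\n' \
  'therock-dist-linux-gfx94X-dcgpu-ADHOCBUILD-7.0.0rc20250625.tar.gz' \
  'therock-dist-linux-gfx94X-dcgpu-7.13.0a20260423.tar.gz' \
  | pick_latest_key gfx94X-dcgpu
```

Filtering before sorting means the version sort only ever compares keys whose version field starts with digits, so the extracted major is always numeric.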
- Gate S3 upload + repo metadata steps on github.repository ==
'ROCm/TransferBench' so forks that set AWS_S3_BUCKET don't try
unauthenticated uploads.
- Add EUID root check to build_packages_local.sh up front (clearer
error than failing later in apt/dnf).
- Rename TAROBALL_BASE -> TARBALL_BASE (typo).
- Sanitize PKG_RELEASE: collapse non-alphanumerics into dots so
feature-branch names stay valid in DEB/RPM release fields.
- Drop ${ROCM_PATH}/lib from RPATH_LIST so the ephemeral SDK
download path is not embedded into the packaged binary.
- Make CPACK_RPM_EXCLUDE_FROM_AUTO_FILELIST_ADDITION track the
actual CPACK_PACKAGING_INSTALL_PREFIX instead of hard-coded
/opt/rocm/extras-N paths.
Co-Authored-By: Claude Opus 4 <[email protected]>
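Two of the items above (the up-front root check and the PKG_RELEASE sanitization) are easy to show in isolation. The following is a sketch under assumptions: the function names `sanitize_pkg_release` and `require_root` are hypothetical, and the exact error text and regex used in the script may differ.

```bash
# Hypothetical helper: collapse every run of non-alphanumeric characters into
# a single dot so branch names stay valid in DEB/RPM release fields.
sanitize_pkg_release() {
  printf '%s\n' "$1" | sed -E 's/[^A-Za-z0-9]+/./g'
}

# Hypothetical helper: fail early with a clear message instead of letting
# apt/dnf fail later. Falls back to `id -u` where EUID is not set.
require_root() {
  if [ "${EUID:-$(id -u)}" -ne 0 ]; then
    echo "error: this script installs system packages; run it as root" >&2
    return 1
  fi
}

# Usage: sanitize this PR's branch name.
sanitize_pkg_release 'users/thananon/therock-packages-workflow-v2'
```

Collapsing runs, rather than replacing character by character, avoids consecutive dots when a branch name contains sequences such as `--` or `/_`.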
Pull request overview
Adds a GitHub Actions-based CI path (and a local build script) to produce relocatable TransferBench packages using TheRock ROCm SDK, with a new CMake packaging mode that drives CPack directly.
Changes:
- Added `build_packages_local.sh` to download TheRock SDK, configure relocatable RPATH, build, and run CPack to emit DEB/RPM/TGZ.
- Added `.github/workflows/build-relocatable-packages.yml` to build packages on Ubuntu 22.04 and manylinux_2_28, upload artifacts, and optionally publish to S3 via OIDC with repo metadata generation.
- Updated `CMakeLists.txt` with a `BUILD_RELOCATABLE_PACKAGE` option to bypass `rocm_create_package` and set CPack metadata/package naming.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `build_packages_local.sh` | Local/CI build driver: installs deps, downloads TheRock SDK, configures relocatable install/RPATH, builds + runs CPack. |
| `CMakeLists.txt` | Adds relocatable packaging mode and new package naming/CPack settings. |
| `.github/workflows/build-relocatable-packages.yml` | CI workflow to build/package on Ubuntu + manylinux, upload artifacts, and optionally push to S3 with OIDC + repo metadata. |
| `.github/workflows/README_BUILD_PACKAGES.md` | Documentation for the new workflow/script, triggers, S3 layout, and usage instructions. |
Packages declare runtime deps on hsa-rocr/numactl and only ship the TransferBench binary; the install tree is movable via $ORIGIN RPATH but target systems still need the ROCm/HSA runtime. Update README so the claim matches the actual packaging behavior.
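One way to check the relocatability claim on a built binary is to inspect its RPATH/RUNPATH entries and confirm that none is an absolute path (for example, a leftover SDK download directory). The sketch below assumes the GNU binutils `readelf -d` output format and only recognizes the `$ORIGIN` spelling; both function names are hypothetical and are not part of this PR.

```bash
# Hypothetical helper: read `readelf -d` output on stdin and print one
# RPATH/RUNPATH entry per line.
rpath_entries() {
  sed -n -E 's/.*\((RPATH|RUNPATH)\).*\[(.*)\].*/\2/p' | tr ':' '\n'
}

# Hypothetical helper: read entries on stdin and succeed only if every one is
# $ORIGIN or starts with $ORIGIN/. An absolute entry would tie the binary to
# one install location.
rpath_is_relocatable() {
  local entry
  while IFS= read -r entry; do
    [ -z "$entry" ] && continue
    case "$entry" in
      '$ORIGIN'|'$ORIGIN'/*) ;;
      *) echo "non-relocatable RPATH entry: $entry" >&2; return 1 ;;
    esac
  done
  return 0
}

# Usage against a built binary (the path is a placeholder):
#   readelf -d ./TransferBench | rpath_entries | rpath_is_relocatable
```

This only validates the search path embedded in the binary; it says nothing about whether the ROCm/HSA runtime is present on the target system.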
nileshnegi (Collaborator) reviewed Apr 24, 2026

    -DCMAKE_VERBOSE_MAKEFILE=ON
    -DBUILD_RELOCATABLE_PACKAGE=ON
    -DBUILD_LOCAL_GPU_TARGET_ONLY=OFF
    -DENABLE_NIC_EXEC=ON

i don't think we want this ON by default
@gilbertlee-amd thoughts?
nileshnegi (Collaborator) reviewed Apr 24, 2026

    build-essential cmake git curl tar xz-utils ca-certificates pkg-config \
    python3 python3-pip \
    libnuma-dev libibverbs-dev rdma-core ibverbs-providers \
    libopenmpi-dev openmpi-bin \

is this aligned with how TheRock sets/expects MPI dependencies for other packages?
Summary
Adds a GitHub Actions workflow and build script for producing relocatable packages (DEB/RPM/TGZ) via the TheRock SDK. This workflow is the primary path for continuous integration and nightly/release builds of TransferBench packages. The install tree is movable via $ORIGIN RPATH, but the packages ship only the TransferBench binary and declare runtime deps on hsa-rocr/numactl, so target machines still need the ROCm/HSA runtime.
Prerequisites: Merged #261 (manylinux portability fixes) to ensure builds succeed on manylinux_2_28 (AlmaLinux 8) with gcc 8-era libstdc++.
Implementation
- `.github/workflows/build-relocatable-packages.yml`: Two-job workflow (Ubuntu 22.04 → DEB/TGZ, manylinux_2_28 → RPM/TGZ) that fetches the latest TheRock SDK tarball, builds with `-DBUILD_RELOCATABLE_PACKAGE=ON`, and uploads artifacts to GitHub Actions + optional S3 (via OIDC).
- `build_packages_local.sh`: Standalone script that downloads TheRock SDK tarballs from S3 by ROCm version and GPU family (gfx94X-dcgpu, gfx950-dcgpu, gfx110X-all, gfx120X-all, gfx1151), extracts them, drives CMake with relocatable RPATH, and produces `amdrocm<MAJOR>-transferbench` packages.
- `CMakeLists.txt`: `BUILD_RELOCATABLE_PACKAGE` option that bypasses `rocm_install`/`rocm_create_package` and drives CPack directly with version-namespaced package naming (`amdrocm7-transferbench`) and relocatable RPATH.

S3 and OIDC
The workflow uploads packages to S3 using GitHub OIDC for authentication (no long-lived credentials). S3 paths:
- `s3://.../nightly/transferbench/{deb,rpm,tar}` with apt/yum repo metadata.
- `s3://.../release/transferbench/{deb,rpm,tar}` with apt/yum metadata.
- `s3://.../transferbench/<branch>/<run-number>/{ubuntu-22.04,manylinux_2_28}` (no metadata).

S3 upload only executes in the `ROCm/TransferBench` repository and requires `AWS_S3_BUCKET` (organization variable) and `AWS_ROLE_ARN` (organization secret).

Supported GPU Families
TheRock SDK tarballs are fetched per GPU family:
The workflow defaults to `gfx94X-dcgpu` (data-center workhorse); other families can be selected via `workflow_dispatch` input.

Trigger Conditions
- Push to `develop`, `mainline`, `release/**`, `users/thananon/therock-packages-workflow-v2` (the last one is temporary for pre-merge validation of this PR; will be removed after merge)
- Pull requests targeting `develop` or `mainline`
- `workflow_dispatch` with optional ROCm version/GPU family override

Testing
Local invocation (no GitHub Actions):
The workflow triggers on push to this PR's branch (`users/thananon/therock-packages-workflow-v2`) so the first CI run will validate the full build→package→upload flow before the PR is merged.
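When validating the upload step, it may help to see the S3 layout and the upload gate from the "S3 and OIDC" section written out as plain shell. This is a sketch only: in the actual workflow the gate is presumably a step-level condition in the YAML, the function names are hypothetical, and `base` stands for the elided bucket-and-prefix portion of the paths, which this PR description does not spell out.

```bash
# Hypothetical helper: build the destination for nightly/release uploads.
# `base` is whatever replaces the elided "..." in s3://.../<channel>/...
s3_destination() {
  local base="$1" channel="$2" format="$3"
  case "$channel" in
    nightly|release) ;;
    *) echo "unknown channel: $channel" >&2; return 1 ;;
  esac
  case "$format" in
    deb|rpm|tar) ;;
    *) echo "unknown format: $format" >&2; return 1 ;;
  esac
  printf 's3://%s/%s/transferbench/%s\n' "$base" "$channel" "$format"
}

# Hypothetical helper: mirror the upload gate. Upload only from the upstream
# repository, and only when a bucket is configured.
should_upload() {
  local repository="$1" bucket="${2:-}"
  [ "$repository" = 'ROCm/TransferBench' ] && [ -n "$bucket" ]
}

# Usage with a placeholder bucket name:
if should_upload 'ROCm/TransferBench' 'example-bucket'; then
  s3_destination 'example-bucket' nightly deb
fi
```

Checking the repository name in addition to the bucket variable is what keeps forks that set `AWS_S3_BUCKET` from attempting uploads they cannot authenticate.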