Draft
21 commits
0252788
Initial pod communication support (#235)
gilbertlee-amd Feb 20, 2026
9edaae8
Adjust min HIP version in Makefile for pod support
gilbertlee-amd Feb 20, 2026
2b17e62
Adding TB_DUMP_CFG_FILE and fixing a deallocation bug
gilbertlee-amd Feb 21, 2026
2b707b7
Adding gfx1250 to CMakeFiles
gilbertlee-amd Feb 24, 2026
2bb8302
Adjusting how HIP headers are included
gilbertlee-amd Feb 25, 2026
060abc2
Updating a2asweep and scaling presets
gilbertlee-amd Mar 9, 2026
6c2ecf7
Fixing logging to prevent recursive error
gilbertlee-amd Mar 9, 2026
794bcf7
Fixing fabric handle bug
gilbertlee-amd Mar 10, 2026
8de0154
Changing table formatting to make it easier to paste
gilbertlee-amd Mar 10, 2026
4a0f390
Showing num iterations when running in timed mode
gilbertlee-amd Mar 13, 2026
bf49ba4
cuda + MNNVL update & pod presets (#241)
AtlantaPepsi Mar 16, 2026
5e61666
Changing NIC_FILTER to TB_NIC_FILTER
gilbertlee-amd Mar 17, 2026
bec2c5e
prefixing remaining env vars with TB_, fixing potential filesystem ch…
gilbertlee-amd Mar 17, 2026
561e2f7
Fixing TB_PAUSE issue
gilbertlee-amd Mar 18, 2026
94cf3c9
Merge pull request #245 from ROCm/develop
AtlantaPepsi Mar 19, 2026
eb92015
Increase CQ size for high qps (#244)
pierreantoineH Mar 19, 2026
275998b
fix hang when NVML is present but fabricmanager isnt (#246)
AtlantaPepsi Mar 23, 2026
168cdc1
Adding HBM read bandwidth preset (#250)
gilbertlee-amd Mar 28, 2026
fdec7d5
Adding TB_WALLCLOCK_RATE in case wallclock rate is reported as 0
gilbertlee-amd Mar 30, 2026
94e581d
ibv dynamic loading
AtlantaPepsi Apr 8, 2026
ebde17f
ibv dynamicloading; addition of CUMEM flag and compilation fixes
AtlantaPepsi Apr 27, 2026
22 changes: 22 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,28 @@
Documentation for TransferBench is available at
[https://rocm.docs.amd.com/projects/TransferBench](https://rocm.docs.amd.com/projects/TransferBench).

## v1.67.00
### Added
- Initial support for pod communication. Requires compatible hardware and ROCm version, and is subject to further testing
  - This potentially enables GFX/DMA executors to access SRC/DST memory locations on GPUs within the same pod
  - Pod membership querying requires amd-smi; the query can be skipped by setting TB_FORCE_SINGLE_POD=1
- Support for dumping executed Transfers to a config file specified by TB_DUMP_CFG_FILE
  - Transfers that are executed (for example via a preset) are written to a config file that can then be executed directly
- Reporting number of iterations run when running in timed mode (NUM_ITERATIONS < 0)
- Adding NIC_CQ_POLL_BATCH to control CQ poll batch size for NIC transfers
- New "hbm" preset which sweeps and tests local HBM read performance
- New TB_WALLCLOCK_RATE that overrides the GPU GFX wallclock rate when the queried rate is reported as 0 (debug)

### Modified
- DMA-BUF support enablement in CMake changed to ENABLE_DMA_BUF to be more similar to other compile-time options
- Adding extra information to CMake and make build methods to indicate enabled / disabled features
- a2asweep preset changes from USE_FINE_GRAIN to MEM_TYPE to reflect various memory types
- a2asweep preset changes from NUM_CUS to NUM_SUB_EXECS to match with a2a preset naming convention
- scaling preset changes from using USE_FINE_GRAIN to CPU_MEM_TYPE and GPU_MEM_TYPE
- NIC_FILTER renamed to TB_NIC_FILTER for consistency
- DUMP_LINES renamed to TB_DUMP_LINES for consistency
- CQs for NIC transfers are now sized dynamically when a high number of QPs is used

## v1.66.02
### Added
- Adding DMA-BUF support
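The TB_WALLCLOCK_RATE entry above describes a fallback that only applies when the queried rate comes back as 0. A minimal sketch of that rule, assuming the override arrives as a decimal string read from the environment; `ResolveWallclockRate` is a hypothetical helper name and not necessarily how TransferBench implements it:

```cpp
#include <climits>
#include <cstdlib>

// Hypothetical helper (not taken from the TransferBench sources).
// queriedRate:  the wallclock rate reported by the runtime (0 means "unknown").
// overrideText: raw override string, e.g. the value of TB_WALLCLOCK_RATE,
//               or nullptr when the variable is not set.
// Returns the rate to use, or 0 if neither source provides a usable value.
int ResolveWallclockRate(int queriedRate, const char* overrideText) {
  // A non-zero reported rate is always trusted; the override is a fallback only.
  if (queriedRate != 0) return queriedRate;
  if (overrideText == nullptr) return 0;

  char* end = nullptr;
  const long parsed = std::strtol(overrideText, &end, 10);

  // Reject empty strings, trailing garbage, non-positive and oversized values.
  const bool consumedAll = (end != overrideText) && (*end == '\0');
  if (!consumedAll || parsed <= 0 || parsed > INT_MAX) return 0;
  return static_cast<int>(parsed);
}
```

A caller would typically pass `std::getenv("TB_WALLCLOCK_RATE")` as the second argument. Keeping the parsing separate from the environment lookup makes the rule easy to unit-test.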
188 changes: 146 additions & 42 deletions CMakeLists.txt
@@ -9,7 +9,7 @@ if (NOT CMAKE_TOOLCHAIN_FILE)
message(STATUS "CMAKE_TOOLCHAIN_FILE: ${CMAKE_TOOLCHAIN_FILE}")
endif()

set(VERSION_STRING "1.66.02")
set(VERSION_STRING "1.67.00")
project(TransferBench VERSION ${VERSION_STRING} LANGUAGES CXX)

## Load CMake modules
@@ -24,8 +24,11 @@ list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake")
#==================================================================================================
option(BUILD_LOCAL_GPU_TARGET_ONLY "Build only for GPUs detected on this machine" OFF)
option(ENABLE_NIC_EXEC "Enable RDMA NIC Executor in TransferBench" OFF)
option(ENABLE_IBV_DIRECT "Link libibverbs symbols directly (OFF: resolve via dlsym)" ON)
option(ENABLE_MPI_COMM "Enable MPI Communicator support" OFF)
option(DISABLE_DMABUF "Disable DMA-BUF support for GPU Direct RDMA" ON)
option(ENABLE_DMA_BUF "Enable DMA-BUF support for GPU Direct RDMA" OFF)
option(ENABLE_AMD_SMI "Enable AMD-SMI pod membership queries" OFF)
option(ENABLE_POD_COMM "Enable pod communication" OFF)

# Default GPU architectures to build
#==================================================================================================
@@ -42,7 +45,8 @@ set(DEFAULT_GPUS
gfx1150
gfx1151
gfx1200
gfx1201)
gfx1201
gfx1250)

## Build only for local GPU architecture
if(BUILD_LOCAL_GPU_TARGET_ONLY)
@@ -67,7 +71,7 @@ else()
endif()

set(GPU_TARGETS "${SUPPORTED_GPUS}")
message(STATUS "Compiling for ${GPU_TARGETS}")
message(STATUS "- Compiling for ${GPU_TARGETS}")

## NOTE: Reload rocm-cmake in order to update GPU_TARGETS
include(cmake/Dependencies.cmake) # Reloading to use desired GPU_TARGETS instead of defaults
@@ -132,87 +136,177 @@ if(DEFINED ENV{DISABLE_NIC_EXEC} AND "$ENV{DISABLE_NIC_EXEC}" STREQUAL "1")
message(STATUS "Disabling NIC Executor support as env. flag DISABLE_NIC_EXEC was enabled")
elseif(NOT ENABLE_NIC_EXEC)
message(STATUS "For CMake builds, NIC executor requires explicit opt-in by setting CMake flag -DENABLE_NIC_EXEC=ON")
message(STATUS "Disabling NIC Executor support")
message(STATUS "- Disabling NIC Executor support")
else()
message(STATUS "Attempting to build with NIC executor support")

find_library(IBVERBS_LIBRARY ibverbs)
find_path(IBVERBS_INCLUDE_DIR infiniband/verbs.h)
if(IBVERBS_LIBRARY AND IBVERBS_INCLUDE_DIR)
add_library(ibverbs SHARED IMPORTED)
set_target_properties(ibverbs PROPERTIES INTERFACE_INCLUDE_DIRECTORIES "${IBVERBS_INCLUDE_DIR}" IMPORTED_LOCATION "${IBVERBS_LIBRARY}" INTERFACE_SYSTEM_INCLUDE_DIRECTORIES "${IBVERBS_INCLUDE_DIR}")
set(IBVERBS_FOUND 1)
message(STATUS "Building with NIC executor support. Can set DISABLE_NIC_EXEC=1 to disable")
message(STATUS "- Building with NIC executor support. Can set DISABLE_NIC_EXEC=1 to disable")
if(ENABLE_IBV_DIRECT)
message(STATUS "- IBV_DIRECT enabled (direct libibverbs linkage); set -DENABLE_IBV_DIRECT=OFF for dlsym path")
else()
message(STATUS "- IBV_DIRECT disabled: libibverbs symbols resolved via dlsym at runtime")
endif()
else()
if(NOT IBVERBS_LIBRARY)
message(WARNING "IBVerbs library not found")
message(WARNING "- IBVerbs library not found")
elseif(NOT IBVERBS_INCLUDE_DIR)
message(WARNING "infiniband/verbs.h not found")
message(WARNING "- infiniband/verbs.h not found")
endif()
message(WARNING "Building without NIC executor support. To use the TransferBench RDMA executor, check if your system has NICs, the NIC drivers are installed, and libibverbs-dev is installed")
message(WARNING "- Building without NIC executor support. To use the TransferBench RDMA executor, check if your system has NICs, the NIC drivers are installed, and libibverbs-dev is installed")
endif()
endif()

## Check for DMA-BUF support (requires IBVERBS_FOUND)
if(IBVERBS_FOUND AND NOT DISABLE_DMABUF)
message(STATUS "Checking for DMA-BUF support...")

# Check for ibv_reg_dmabuf_mr
include(CheckSymbolExists)
set(CMAKE_REQUIRED_INCLUDES ${IBVERBS_INCLUDE_DIR})
set(CMAKE_REQUIRED_LIBRARIES ${IBVERBS_LIBRARY})
check_symbol_exists(ibv_reg_dmabuf_mr "infiniband/verbs.h" HAVE_IBV_DMABUF)

# Check for hsa_amd_portable_export_dmabuf
set(CMAKE_REQUIRED_INCLUDES ${HSA_INCLUDE_DIR})
set(CMAKE_REQUIRED_LIBRARIES ${HSA_LIBRARY})
check_symbol_exists(hsa_amd_portable_export_dmabuf "hsa_ext_amd.h" HAVE_ROCM_DMABUF)

# Enable DMA-BUF only if both APIs are available
if(HAVE_IBV_DMABUF AND HAVE_ROCM_DMABUF)
set(DMABUF_SUPPORT_FOUND 1)
message(STATUS "Building with DMA-BUF support")
if(IBVERBS_FOUND)
if(DEFINED ENV{DISABLE_DMABUF} AND "$ENV{DISABLE_DMABUF}" STREQUAL "1")
message(STATUS "Disabling DMA-BUF support as env. flag DISABLE_DMABUF was enabled")
elseif(NOT ENABLE_DMA_BUF)
message(STATUS "For CMake builds, DMA-BUF support requires explicit opt-in by setting CMake flag -DENABLE_DMA_BUF=ON")
message(STATUS "- Disabling DMA-BUF support")
else()
if(NOT HAVE_IBV_DMABUF AND NOT HAVE_ROCM_DMABUF)
message(WARNING "Building without DMA-BUF support: missing both ibv_reg_dmabuf_mr and ROCm DMA-BUF export")
elseif(NOT HAVE_IBV_DMABUF)
message(WARNING "Building without DMA-BUF support: missing ibv_reg_dmabuf_mr")
message(STATUS "Attempting to build with DMA-BUF support")

# Check for ibv_reg_dmabuf_mr
set(CMAKE_REQUIRED_INCLUDES ${IBVERBS_INCLUDE_DIR})
set(CMAKE_REQUIRED_LIBRARIES ${IBVERBS_LIBRARY})
check_symbol_exists(ibv_reg_dmabuf_mr "infiniband/verbs.h" HAVE_IBV_DMABUF)

# Check for hsa_amd_portable_export_dmabuf
set(CMAKE_REQUIRED_INCLUDES ${HSA_INCLUDE_DIR})
set(CMAKE_REQUIRED_LIBRARIES ${HSA_LIBRARY})
check_symbol_exists(hsa_amd_portable_export_dmabuf "hsa_ext_amd.h" HAVE_ROCM_DMABUF)

# Enable DMA-BUF only if both APIs are available
if(HAVE_IBV_DMABUF AND HAVE_ROCM_DMABUF)
set(DMABUF_SUPPORT_FOUND 1)
message(STATUS "- Building with DMA-BUF support")
else()
message(WARNING "Building without DMA-BUF support: missing ROCm DMA-BUF export")
if(NOT HAVE_IBV_DMABUF AND NOT HAVE_ROCM_DMABUF)
message(WARNING "- Building without DMA-BUF support: missing both ibv_reg_dmabuf_mr and ROCm DMA-BUF export")
elseif(NOT HAVE_IBV_DMABUF)
message(WARNING "- Building without DMA-BUF support: missing ibv_reg_dmabuf_mr")
else()
message(WARNING "- Building without DMA-BUF support: missing ROCm DMA-BUF export")
endif()
endif()
endif()
elseif(NOT DISABLE_DMABUF)
message(WARNING "DMA-BUF support requires ENABLE_NIC_EXEC=ON")
endif()

## Check for MPI support
set(MPI_PATH "" CACHE PATH "Path to MPI installation (takes priority over system MPI)")
if(NOT ENABLE_MPI_COMM)
if(DEFINED ENV{DISABLE_MPI_COMM} AND "$ENV{DISABLE_MPI_COMM}" STREQUAL "1")
message(STATUS "Disabling MPI Communicator support as env. flag DISABLE_MPI_COMM was enabled")
elseif(NOT ENABLE_MPI_COMM)
message(STATUS "For CMake builds, MPI Communicator requires explicit opt-in by setting CMake flag -DENABLE_MPI_COMM=ON")
message(STATUS "Disabling MPI Communicator support")
else()
message(STATUS "Attempting to build with MPI communicator support")
# First check user-specified MPI_PATH (similar to Makefile)
if(MPI_PATH AND EXISTS "${MPI_PATH}/include/mpi.h")
find_library(MPI_LIBRARY NAMES mpi PATHS ${MPI_PATH}/lib NO_DEFAULT_PATH)
if(MPI_LIBRARY)
set(MPI_COMM_FOUND 1)
set(MPI_INCLUDE_DIR "${MPI_PATH}/include")
set(MPI_LINK_DIR "${MPI_PATH}/lib")
message(STATUS "Building with MPI Communicator support (found at MPI_PATH: ${MPI_PATH})")
message(STATUS "- Building with MPI Communicator support (found at MPI_PATH: ${MPI_PATH})")
else()
message(WARNING "Found mpi.h at ${MPI_PATH}/include but could not find MPI library at ${MPI_PATH}/lib")
message(WARNING "- Found mpi.h at ${MPI_PATH}/include but could not find MPI library at ${MPI_PATH}/lib")
endif()
else()
# Fall back to find_package
if(MPI_PATH)
message(STATUS "Unable to find mpi.h at ${MPI_PATH}/include, trying find_package")
message(STATUS "- Unable to find mpi.h at ${MPI_PATH}/include, trying find_package")
endif()
find_package(MPI QUIET)
if(MPI_CXX_FOUND)
set(MPI_COMM_FOUND 1)
message(STATUS "Building with MPI Communicator support (found via find_package)")
message(STATUS "- Using MPI include path: ${MPI_CXX_INCLUDE_PATH}")
message(STATUS "- Using MPI library:: ${MPI_CXX_LIBRARIES}")
message(STATUS "- Building with MPI Communicator support (found via find_package)")
message(STATUS " - Using MPI include path: ${MPI_CXX_INCLUDE_PATH}")
message(STATUS " - Using MPI library: ${MPI_CXX_LIBRARIES}")
else()
message(WARNING "- MPI not found. Please specify appropriate MPI_PATH or install MPI libraries (e.g., OpenMPI or MPICH)")
endif()
endif()
endif()

## Check for AMD-SMI support
if(DEFINED ENV{DISABLE_AMD_SMI} AND "$ENV{DISABLE_AMD_SMI}" STREQUAL "1")
message(STATUS "Disabling AMD-SMI support as env. flag DISABLE_AMD_SMI was enabled")
elseif(NOT ENABLE_AMD_SMI)
message(STATUS "For CMake builds, AMD-SMI support requires explicit opt-in by setting CMake flag -DENABLE_AMD_SMI=ON")
message(STATUS "- Disabling AMD-SMI support")
else()
set(AMD_SMI_EXECUTABLE "amd-smi" CACHE STRING "Path to amd-smi executable")
execute_process(
COMMAND ${AMD_SMI_EXECUTABLE} version
OUTPUT_VARIABLE AMD_SMI_VERSION_OUTPUT
ERROR_VARIABLE AMD_SMI_VERSION_ERROR
RESULT_VARIABLE AMD_SMI_RESULT
)
if(NOT AMD_SMI_RESULT EQUAL 0)
message(STATUS "- ${AMD_SMI_EXECUTABLE} not found. Disabling AMD-SMI support")
else()
string(REGEX MATCH "Library version: ([0-9]+)\\.([0-9]+)" _match "${AMD_SMI_VERSION_OUTPUT}")
if(CMAKE_MATCH_1)
set(AMD_SMI_MAJOR ${CMAKE_MATCH_1})
set(AMD_SMI_MINOR ${CMAKE_MATCH_2})
set(AMD_SMI_MIN_MAJOR 26)
set(AMD_SMI_MIN_MINOR 4)
if((AMD_SMI_MAJOR GREATER AMD_SMI_MIN_MAJOR) OR
(AMD_SMI_MAJOR EQUAL AMD_SMI_MIN_MAJOR AND (AMD_SMI_MINOR GREATER AMD_SMI_MIN_MINOR OR AMD_SMI_MINOR EQUAL AMD_SMI_MIN_MINOR)))
message(STATUS "- Detected amd-smi version ${AMD_SMI_MAJOR}.${AMD_SMI_MINOR} which has pod support")
set(AMD_SMI_FOUND 1)
else()
message(STATUS "- Detected amd-smi version ${AMD_SMI_MAJOR}.${AMD_SMI_MINOR} which does not have pod support")
message(STATUS "- Pod membership querying requires amd-smi version of at least ${AMD_SMI_MIN_MAJOR}.${AMD_SMI_MIN_MINOR}")
message(STATUS "- Pod membership may be forced in TransferBench by setting TB_FORCE_SINGLE_POD=1")
endif()
else()
message(WARNING "MPI not found. Please specify appropriate MPI_PATH or install MPI libraries (e.g., OpenMPI or MPICH)")
message(STATUS "- Could not parse amd-smi version. Disabling AMD-SMI support")
endif()
endif()
endif()

## Check for pod communication support
if(DEFINED ENV{DISABLE_POD_COMM} AND "$ENV{DISABLE_POD_COMM}" STREQUAL "1")
message(STATUS "Disabling pod communication support as env. flag DISABLE_POD_COMM was enabled")
elseif(NOT ENABLE_POD_COMM)
message(STATUS "For CMake builds, pod communication support requires explicit opt-in by setting CMake flag -DENABLE_POD_COMM=ON")
message(STATUS "- Disabling pod communication support")
else()
set(HIPCONFIG_EXECUTABLE "hipconfig" CACHE STRING "Path to hipconfig executable")
execute_process(
COMMAND ${HIPCONFIG_EXECUTABLE} --version
OUTPUT_VARIABLE HIP_VERSION_OUTPUT
ERROR_VARIABLE HIP_VERSION_ERROR
RESULT_VARIABLE HIPCONFIG_RESULT
)
if(NOT HIPCONFIG_RESULT EQUAL 0)
message(STATUS "- Unable to determine HIP version via ${HIPCONFIG_EXECUTABLE}. Try specifying path to hipconfig in HIPCONFIG_EXECUTABLE")
message(STATUS "- Disabling pod communication support")
else()
string(REGEX MATCH "([0-9]+)\\.([0-9]+)" _match "${HIP_VERSION_OUTPUT}")
if(CMAKE_MATCH_1)
set(HIP_MAJOR ${CMAKE_MATCH_1})
set(HIP_MINOR ${CMAKE_MATCH_2})
set(HIP_MIN_MAJOR 8)
set(HIP_MIN_MINOR 0)
if((HIP_MAJOR GREATER HIP_MIN_MAJOR) OR
(HIP_MAJOR EQUAL HIP_MIN_MAJOR AND (HIP_MINOR GREATER HIP_MIN_MINOR OR HIP_MINOR EQUAL HIP_MIN_MINOR)))
message(STATUS "- Detected HIP version ${HIP_MAJOR}.${HIP_MINOR} which has pod support")
set(POD_COMM_FOUND 1)
else()
message(STATUS "- Detected HIP version ${HIP_MAJOR}.${HIP_MINOR} which does not have pod support")
message(STATUS "- Pod support requires HIP version of at least ${HIP_MIN_MAJOR}.${HIP_MIN_MINOR}")
endif()
else()
message(STATUS "- Could not parse HIP version. Disabling pod communication support")
endif()
endif()
endif()
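Both the AMD-SMI and HIP checks in this hunk follow the same pattern: extract `major.minor` from a tool's text output, then require it to be at least a minimum version. A minimal C++ sketch of that gate, for readers who want the logic outside of CMake. `ParseMajorMinor`, `MeetsMinimum` and `Version` are hypothetical names, and the sketch does not reproduce the exact output format of `amd-smi` or `hipconfig`:

```cpp
#include <optional>
#include <regex>
#include <stdexcept>
#include <string>
#include <utility>

using Version = std::pair<int, int>;  // {major, minor}

// Extracts the first "major.minor" pair from `output`.
// If `prefix` is non-empty, the digits must follow it immediately, mirroring
// a pattern such as "Library version: ([0-9]+)\.([0-9]+)".
std::optional<Version> ParseMajorMinor(const std::string& output,
                                       const std::string& prefix = "") {
  std::size_t start = 0;
  std::regex_constants::match_flag_type flags = std::regex_constants::match_default;
  if (!prefix.empty()) {
    start = output.find(prefix);
    if (start == std::string::npos) return std::nullopt;
    start += prefix.size();
    // Anchor the match to the character right after the prefix.
    flags = std::regex_constants::match_continuous;
  }

  const std::string tail = output.substr(start);
  static const std::regex pattern(R"(([0-9]+)\.([0-9]+))");
  std::smatch match;
  if (!std::regex_search(tail, match, pattern, flags)) return std::nullopt;

  try {
    return Version{std::stoi(match[1].str()), std::stoi(match[2].str())};
  } catch (const std::out_of_range&) {
    return std::nullopt;  // absurdly long digit run: treat as unparseable
  }
}

// True when `found` is at least `minimum`.
// std::pair compares lexicographically, which is exactly
// "major greater, or major equal and minor greater-or-equal".
bool MeetsMinimum(Version found, Version minimum) {
  return found >= minimum;
}
```

The lexicographic comparison replaces the nested `GREATER` / `EQUAL` conditions with a single expression. CMake's `VERSION_GREATER_EQUAL` operator plays a similar role inside `if()`.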
@@ -230,6 +324,9 @@ if(IBVERBS_FOUND)
target_include_directories(TransferBench PRIVATE ${IBVERBS_INCLUDE_DIR})
target_link_libraries(TransferBench PRIVATE ${IBVERBS_LIBRARY})
target_compile_definitions(TransferBench PRIVATE NIC_EXEC_ENABLED)
if(ENABLE_IBV_DIRECT)
Comment on lines 325 to +327

Copilot AI Apr 8, 2026

ENABLE_IBV_DIRECT=OFF claims to resolve symbols via dlsym at runtime, but the target still links against ${IBVERBS_LIBRARY} unconditionally when IBVERBS_FOUND. This keeps libibverbs as a hard dependency and defeats optional loading. If runtime optionality is intended, avoid linking ${IBVERBS_LIBRARY} when ENABLE_IBV_DIRECT is OFF and ensure all ibv_* usage is through the loaded function pointers.

Suggested change
- target_link_libraries(TransferBench PRIVATE ${IBVERBS_LIBRARY})
- target_compile_definitions(TransferBench PRIVATE NIC_EXEC_ENABLED)
- if(ENABLE_IBV_DIRECT)
+ target_compile_definitions(TransferBench PRIVATE NIC_EXEC_ENABLED)
+ if(ENABLE_IBV_DIRECT)
+ target_link_libraries(TransferBench PRIVATE ${IBVERBS_LIBRARY})
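
For context on the "dlsym path" discussed in this comment, here is a minimal sketch of resolving a libibverbs entry point at runtime instead of at link time. It assumes a POSIX system with `<dlfcn.h>`. `ResolveSymbol` and `LoadGetDeviceList` are hypothetical helper names, and the PR's actual loader may be structured differently:

```cpp
#include <dlfcn.h>

struct ibv_device;  // opaque forward declaration; no libibverbs headers needed

// Function-pointer type matching ibv_get_device_list(int* num_devices).
using GetDeviceListFn = ibv_device** (*)(int*);

// Opens `libraryName` and looks up `symbolName`.
// Returns nullptr if either the library or the symbol is unavailable.
// On success the library handle is intentionally kept open, because the
// returned pointer is only valid while the library stays loaded.
void* ResolveSymbol(const char* libraryName, const char* symbolName) {
  void* handle = dlopen(libraryName, RTLD_NOW | RTLD_LOCAL);
  if (handle == nullptr) return nullptr;

  void* symbol = dlsym(handle, symbolName);
  if (symbol == nullptr) dlclose(handle);  // nothing usable: release the handle
  return symbol;
}

// Hypothetical wrapper: resolves ibv_get_device_list, or returns nullptr when
// libibverbs is not installed, so the caller can disable the NIC executor.
GetDeviceListFn LoadGetDeviceList(const char* libraryName = "libibverbs.so.1") {
  return reinterpret_cast<GetDeviceListFn>(
      ResolveSymbol(libraryName, "ibv_get_device_list"));
}
```

With this approach the binary only carries a hard dependency on libibverbs if it is also passed to the linker, which is the point the comment raises. On older glibc versions the program itself must link against `libdl`.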

target_compile_definitions(TransferBench PRIVATE IBV_DIRECT=1)
endif()
endif()
if(MPI_COMM_FOUND)
if(TARGET MPI::MPI_CXX)
@@ -246,6 +343,13 @@ endif()
if(DMABUF_SUPPORT_FOUND)
target_compile_definitions(TransferBench PRIVATE HAVE_DMABUF_SUPPORT)
endif()
if(AMD_SMI_FOUND)
target_compile_definitions(TransferBench PRIVATE AMD_SMI_ENABLED)
endif()
if(POD_COMM_FOUND)
target_compile_definitions(TransferBench PRIVATE POD_COMM_ENABLED)
endif()

if (HAVE_PARALLEL_JOBS)
target_compile_options(TransferBench PRIVATE -parallel-jobs=12)
endif()
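
The compile definitions set in this file (`NIC_EXEC_ENABLED`, `IBV_DIRECT`, `HAVE_DMABUF_SUPPORT`, `AMD_SMI_ENABLED`, `POD_COMM_ENABLED`) only take effect where the C++ sources test them. A minimal sketch of how such flags can be surfaced at runtime. `EnabledFeatures` is a hypothetical function and the label strings are illustrative:

```cpp
#include <string>
#include <vector>

// Returns a human-readable list of the optional features compiled into this
// binary, based on the preprocessor definitions passed by the build system.
std::vector<std::string> EnabledFeatures() {
  std::vector<std::string> features;
#ifdef NIC_EXEC_ENABLED
  features.push_back("NIC executor");
#ifdef IBV_DIRECT
  features.push_back("libibverbs linked directly");
#else
  features.push_back("libibverbs resolved via dlsym");
#endif
#endif
#ifdef HAVE_DMABUF_SUPPORT
  features.push_back("DMA-BUF");
#endif
#ifdef AMD_SMI_ENABLED
  features.push_back("AMD-SMI pod membership queries");
#endif
#ifdef POD_COMM_ENABLED
  features.push_back("pod communication");
#endif
  return features;
}
```

Printing this list at startup would complement the new build-time status messages by showing which features a given binary was actually built with.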