Restructure iris into host/device split, delete IrisGluon #515

Open
mawad-amd wants to merge 16 commits into main from muhaawad/refactor

@mawad-amd
Collaborator

Summary

  • Restructure the iris package from a flat layout into a clear host/ and device/ split
  • Unify Iris and IrisGluon into a single Iris host class — all host-side code duplication is gone
  • Move gluon device code out of experimental/ into iris/device/gluon/
  • Add iris.gluon shortcut module (from iris.gluon import IrisDeviceCtx)
  • Consolidate duplicate docs into docs/reference/host/ (one source of truth for the unified host API)
  • Delete 21 superseded files, update all imports across tests, examples, CCL, ops, and docs

New directory structure

iris/
├── host/           # Host-side (Python, no JIT)
│   ├── iris.py     # Unified Iris class (was iris.py + iris_gluon.py)
│   ├── memory/     # SymmetricHeap, tensors, allocators
│   ├── distributed/# helpers, topology, fd_passing
│   ├── tracing/    # Tracing, TraceEvent, kernel_artifacts
│   ├── logging/
│   └── platform/   # hip, utils
├── device/         # Device-side (JIT kernels)
│   ├── utils.py    # Shared intrinsics
│   ├── triton/     # DeviceContext, ops (load/store/atomic_*)
│   └── gluon/      # IrisDeviceCtx, GluonDeviceTracing
├── gluon.py        # Shortcut: from iris.gluon import IrisDeviceCtx
├── ccl/            # Unchanged
├── ops/            # Unchanged
└── x/              # Unchanged
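The iris.gluon shortcut listed above is, in effect, a re-export module. The sketch below illustrates that pattern with throwaway stand-in modules so it runs without iris installed. The `demo_iris` package name and the empty placeholder class are hypothetical, and the real iris/gluon.py may be organized differently. It only shows that a short path and a long path can resolve to the same object.

```python
import sys
import types


def _make_module(name, is_package=False):
    """Register an empty in-memory module (hypothetical stand-in, not iris)."""
    module = types.ModuleType(name)
    if is_package:
        module.__path__ = []  # mark it as a package
    sys.modules[name] = module
    return module


_make_module("demo_iris", is_package=True)
_make_module("demo_iris.device", is_package=True)
_make_module("demo_iris.device.gluon", is_package=True)
context = _make_module("demo_iris.device.gluon.context")


class IrisDeviceCtx:
    """Placeholder for the device-side context class."""


context.IrisDeviceCtx = IrisDeviceCtx

# The shortcut module only re-exports the name from its canonical location.
shortcut = _make_module("demo_iris.gluon")
shortcut.IrisDeviceCtx = context.IrisDeviceCtx

from demo_iris.gluon import IrisDeviceCtx as ShortPath
from demo_iris.device.gluon.context import IrisDeviceCtx as LongPath

same_object = ShortPath is LongPath
```

Because the shortcut re-exports rather than redefines the class, `isinstance` checks and type annotations behave identically whichever path a caller imports from.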

Test plan

  • All imports verified on GPU node (MI308X)
  • torchrun --nproc_per_node=4 examples/25_ccl_all_gather/example.py --validate (Triton)
  • torchrun --nproc_per_node=4 examples/25_ccl_all_gather/example.py --validate --use_gluon (Gluon)
  • torchrun --nproc_per_node=4 examples/24_ccl_all_reduce/example.py --validate
  • Full unit test suite (78/78 pass, pre-existing dmabuf/vmem skips unchanged)
  • ruff check passes
  • CI

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 22, 2026 07:05
mawad-amd requested review from BKP and neoblizz as code owners April 22, 2026 07:05
github-actions bot added the in-progress (We are working on it) and iris (Iris project issue) labels Apr 22, 2026
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR restructures the iris package into a host/device split, unifies the host API into a single Iris class, and moves Gluon device code out of experimental/ while updating imports across code, tests, examples, and docs.

Changes:

  • Introduce iris/host/* (host-side, non-JIT) and iris/device/* (device-side kernels) with updated public re-exports.
  • Replace iris.experimental.iris_gluon usage with iris.gluon/iris.device.gluon and update tests/examples accordingly.
  • Consolidate and reorganize Sphinx reference docs to reflect unified host API and separate Triton/Gluon device APIs.

Reviewed changes

Copilot reviewed 88 out of 94 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `tests/unittests/test_store_gluon.py` | Update Gluon unit test imports to iris + iris.gluon.IrisDeviceCtx. |
| `tests/unittests/test_put_gluon.py` | Same import migration for PUT Gluon unit test. |
| `tests/unittests/test_load_gluon.py` | Same import migration for LOAD Gluon unit test. |
| `tests/unittests/test_gluon_cache_modifiers.py` | Update imports and IrisDeviceCtx reference for cache-modifier tests. |
| `tests/unittests/test_get_gluon.py` | Same import migration for GET Gluon unit test. |
| `tests/unittests/test_device_context_gluon.py` | Update imports to new tracing/events location and iris.gluon. |
| `tests/unittests/test_copy_gluon.py` | Same import migration for COPY Gluon unit test. |
| `tests/unittests/test_broadcast_gluon.py` | Switch host context import to unified iris.iris(). |
| `tests/unittests/test_atomic_xor_gluon.py` | Same import migration for atomic XOR Gluon unit test. |
| `tests/unittests/test_atomic_xchg_gluon.py` | Same import migration for atomic XCHG Gluon unit test. |
| `tests/unittests/test_atomic_or_gluon.py` | Same import migration for atomic OR Gluon unit test. |
| `tests/unittests/test_atomic_min_gluon.py` | Same import migration for atomic MIN Gluon unit test. |
| `tests/unittests/test_atomic_max_gluon.py` | Same import migration for atomic MAX Gluon unit test. |
| `tests/unittests/test_atomic_cas_gluon.py` | Same import migration for atomic CAS Gluon unit test. |
| `tests/unittests/test_atomic_and_gluon.py` | Same import migration for atomic AND Gluon unit test. |
| `tests/unittests/test_atomic_add_gluon.py` | Same import migration for atomic ADD Gluon unit test. |
| `tests/ccl/test_all_to_all_gluon.py` | Update CCL Gluon test to use unified iris.iris(). |
| `tests/ccl/test_all_gather_gluon.py` | Update CCL Gluon test to use unified iris.iris(). |
| `iris/x/reduce_scatter.py` | Update DeviceContext import path to new Triton device context module. |
| `iris/x/gather.py` | Update DeviceContext import path to new Triton device context module. |
| `iris/x/all_to_all.py` | Update DeviceContext import path to new Triton device context module. |
| `iris/x/all_reduce.py` | Update DeviceContext import path to new Triton device context module. |
| `iris/x/all_gather.py` | Update DeviceContext import path to new Triton device context module. |
| `iris/util.py` | Update copyright year range. |
| `iris/tracing/events.py` | Add SPDX/copyright header to tracing events module. |
| `iris/tracing/device.py` | Update import to new device utils location. |
| `iris/tracing/core.py` | Add SPDX/copyright header and update HIP import to host platform module. |
| `iris/tracing/__init__.py` | Remove old tracing package re-exports. |
| `iris/tensor_utils.py` | Add SPDX/copyright header. |
| `iris/tensor_creation.py` | Update doc references and host import paths (logger + sim env helper). |
| `iris/symmetric_heap.py` | Update imports to new host memory/distributed/platform modules. |
| `iris/ops/matmul_reduce_scatter.py` | Update tracing kernel artifact import to host tracing module. |
| `iris/ops/matmul_all_reduce.py` | Update tracing kernel artifact import to host tracing module. |
| `iris/ops/matmul_all_gather.py` | Update tracing kernel artifact import to host tracing module. |
| `iris/ops/all_gather_matmul.py` | Update tracing kernel artifact import to host tracing module. |
| `iris/logging.py` | Update copyright year range. |
| `iris/host/tracing/__init__.py` | Add host tracing package initializer and re-exports. |
| `iris/host/platform/__init__.py` | Add host platform package initializer. |
| `iris/host/memory/allocators/__init__.py` | Add host allocator package initializer and re-exports. |
| `iris/host/memory/__init__.py` | Add host memory package initializer. |
| `iris/host/logging/__init__.py` | Add host logging package initializer and re-exports. |
| `iris/host/iris.py` | Introduce unified host Iris class in new host package. |
| `iris/host/distributed/__init__.py` | Add host distributed package initializer. |
| `iris/host/__init__.py` | Add host package initializer. |
| `iris/gluon.py` | Add iris.gluon shortcut module re-exporting Gluon device context/tracing. |
| `iris/fd_passing.py` | Update distributed helper import to new host distributed module. |
| `iris/experimental/iris_gluon.py` | Delete legacy IrisGluon implementation. |
| `iris/experimental/__init__.py` | Update experimental docs and re-export to new device Gluon modules. |
| `iris/device/triton/ops.py` | Add Triton device-side functional RMA API under new device package. |
| `iris/device/triton/context.py` | Add Triton device-side OO DeviceContext under new device package. |
| `iris/device/triton/__init__.py` | Add Triton device package initializer exporting context + ops. |
| `iris/device/gluon/tracing.py` | Add Gluon device-side tracing under new device package. |
| `iris/device/gluon/context.py` | Add Gluon device-side IrisDeviceCtx under new device package. |
| `iris/device/gluon/__init__.py` | Add Gluon device package initializer exporting context + tracing. |
| `iris/device/__init__.py` | Add device package initializer. |
| `iris/ccl/utils.py` | Update group-info helper import to new host distributed module. |
| `iris/ccl/reduce_scatter.py` | Update tracing kernel artifact import to host tracing module. |
| `iris/ccl/all_to_all.py` | Update tracing import + Gluon IrisDeviceCtx import path and error message. |
| `iris/ccl/all_reduce.py` | Update tracing kernel artifact import to host tracing module. |
| `iris/ccl/all_gather.py` | Update tracing import + Gluon IrisDeviceCtx import path and error message. |
| `iris/bench/_runner.py` | Remove separate Gluon host context creation path (always use unified iris.iris). |
| `iris/bench/_core.py` | Update benchmark docstring wording around --use_gluon. |
| `iris/allocators/vmem_allocator.py` | Update HIP import path to new host platform module. |
| `iris/allocators/torch_allocator.py` | Update HIP/fd_passing/is_simulation_env import paths to host modules. |
| `iris/allocators/__init__.py` | Remove old allocators package initializer (replaced by host path). |
| `iris/_distributed_helpers.py` | Update tracing kernel artifact import to host tracing module. |
| `iris/__init__.py` | Update top-level public API re-exports to new host/device module locations. |
| `examples/32_gluon_all_gather_tracing/all_gather_tracing.py` | Update example imports to unified iris + iris.gluon. |
| `examples/25_ccl_all_gather/example.py` | Remove separate Gluon host context creation path (always use unified iris.iris). |
| `examples/06_message_passing/message_passing_gluon.py` | Update example imports and IrisDeviceCtx usage to new shortcut module. |
| `docs/sphinx/_toc.yml.in` | Update navigation to new consolidated host reference docs. |
| `docs/sphinx/_toc.yml` | Update navigation to new consolidated host reference docs. |
| `docs/reference/triton/tensor-creation.md` | Remove superseded triton host API doc page (moved to host reference). |
| `docs/reference/triton/device-functions.md` | Update autodoc targets to new Triton device ops module. |
| `docs/reference/triton/class.md` | Remove superseded triton host API doc page (moved to host reference). |
| `docs/reference/triton/ccl.md` | Remove superseded triton host API doc page (moved to host reference). |
| `docs/reference/host/tensor-creation.md` | Add consolidated host tensor-creation reference page. |
| `docs/reference/host/class.md` | Add consolidated host Iris class reference page. |
| `docs/reference/host/ccl.md` | Add consolidated host CCL reference page. |
| `docs/reference/gluon/tensor-creation.md` | Remove superseded Gluon host API doc page (host API unified). |
| `docs/reference/gluon/overview.md` | Update Gluon docs to use unified host context and iris.gluon shortcut. |
| `docs/reference/gluon/device-functions.md` | Update autodoc targets to new Gluon device context module. |
| `docs/reference/gluon/class.md` | Remove superseded Gluon host API doc page (host API unified). |
| `docs/reference/gluon/ccl.md` | Remove superseded Gluon host API doc page (host API unified). |
| `docs/reference/api-reference.md` | Rework API reference landing page into Host/Triton/Gluon sections. |
| `docs/index.md` | Update docs front page example to unified host context and iris.gluon. |
| `docs/conf.py` | Update autodoc settings and mocked imports to match new module layout. |
| `README.md` | Update README example to unified host context and iris.gluon. |
Comments suppressed due to low confidence (3)

iris/device/triton/ops.py:1

  • translated_dst is cast using src_ptr.dtype rather than dst_ptr.dtype. If src_ptr and dst_ptr differ in pointer element type (even accidentally), this will produce an incorrectly typed destination pointer and can miscompile or mis-store. Cast translated_dst to dst_ptr.dtype instead.

iris/device/triton/context.py:1

  • Same issue as in iris.device.triton.ops.copy: translated_dst should be cast with dst_ptr.dtype, not src_ptr.dtype, to keep the translated destination pointer type-correct.

iris/device/triton/ops.py:1

  • The load() docstring describes pointer as being in the from_rank address space, but the implementation translates with __translate(pointer, to_rank, from_rank, ...), which implies pointer is expressed in the to_rank (local/current) address space. Please update the parameter docs to match the actual translation direction to avoid incorrect usage by kernel authors.
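To make the translation-direction concern concrete, here is a minimal host-side sketch of symmetric-heap address translation. It assumes each rank's heap starts at a known base and that an allocation sits at the same offset on every rank. The `translate` function, its argument names, and the addresses are hypothetical illustrations, not the `__translate` implementation in iris.

```python
def translate(pointer, current_rank, target_rank, heap_bases):
    """Re-express `pointer` (valid in current_rank's heap) in target_rank's heap."""
    offset = pointer - heap_bases[current_rank]
    return heap_bases[target_rank] + offset


heap_bases = [0x1000, 0x8000]  # hypothetical heap base address per rank
local_ptr = 0x1040             # offset 0x40 inside rank 0's heap

# Correct direction: the pointer is expressed in rank 0's address space.
remote_ptr = translate(local_ptr, 0, 1, heap_bases)

# Swapped ranks: the same pointer is wrongly treated as a rank 1 address.
swapped_ptr = translate(local_ptr, 1, 0, heap_bases)
```

With these numbers the correct call lands at offset 0x40 in rank 1's heap, while the swapped call produces an address outside both heaps. A docstring that names the wrong address space invites exactly this kind of mistake.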

Comment thread docs/sphinx/_toc.yml
Comment thread docs/reference/host/class.md Outdated
Comment thread iris/ccl/all_to_all.py Outdated
mawad-amd and others added 9 commits April 22, 2026 00:21
Reorganize the iris package from a flat directory structure into a clean
host-side vs device-side separation. The host/ directory contains the
unified Iris class, memory management, distributed helpers, tracing,
logging, and platform utilities. The device/ directory contains Triton
and Gluon device-side contexts and operations.

Key changes:
- Split iris.py (2687 lines) into host/iris.py (~1300 lines),
  device/triton/context.py (~685 lines), device/triton/ops.py (~686 lines)
- Move IrisDeviceCtx and GluonDeviceTracing from experimental/ to
  device/gluon/ (no longer experimental)
- Group loose files into logical subdirs: memory/, distributed/, tracing/,
  logging/, platform/
- Fix _build_device_context no-tracing padding from [0] to [0]*17 for
  safe decoding by gluon kernels compiled with tracing=True
- Old file paths remain as thin re-export shims for backward compat
- ccl/, ops/, x/ modules are unchanged (follow-up work)

Co-Authored-By: Claude Opus 4.6 <[email protected]>
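The padding fix in the commit above comes down to fixed-width decoding. The sketch below assumes a flat context list with a two-field header followed by a tracing block. Only the width of 17 comes from the commit message; the header layout and the function names are hypothetical. The decoder stands in for a kernel compiled with tracing=True that always reads the full tracing block.

```python
TRACING_FIELDS = 17  # width of the tracing block, per the commit message


def build_context(rank, world_size, pad_width=TRACING_FIELDS):
    """Return a flat context list with tracing disabled: header + zero padding."""
    header = [rank, world_size]  # hypothetical header layout
    return header + [0] * pad_width


def decode_tracing_block(context, header_len=2):
    """Read the whole tracing block, as a tracing-enabled decoder would."""
    block = context[header_len:header_len + TRACING_FIELDS]
    if len(block) != TRACING_FIELDS:
        raise IndexError(
            f"expected {TRACING_FIELDS} tracing fields, got {len(block)}"
        )
    return block


short_ctx = build_context(0, 4, pad_width=1)  # old padding: [0]
padded_ctx = build_context(0, 4)              # new padding: [0] * 17

try:
    decode_tracing_block(short_ctx)
    short_ctx_decodes = True
except IndexError:
    short_ctx_decodes = False

padded_block = decode_tracing_block(padded_ctx)
```

This host-side model raises an error for the short context. Device code typically has no such bounds check, so a short buffer would more likely be read past its end, which is presumably what the wider padding guards against.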
Remove backward-compat shims — all consumers (ccl/, ops/, x/,
experimental/) now import directly from the new host/device paths.
Delete 21 files that were superseded by the restructure. Update
copyright headers to 2025-2026.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Update tests, examples, docs, bench, and CCL error messages to use
new import paths directly. No backward-compat shim — iris_gluon.py
is gone and IrisGluon is replaced by the unified Iris host class.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

- Add iris/gluon.py re-export module so users can write
  `from iris.gluon import IrisDeviceCtx` instead of the verbose
  `from iris.device.gluon.context import IrisDeviceCtx`
- Update all tests, examples, and docs to use the short path
- Consolidate duplicate host-side doc pages (class, tensor-creation,
  CCL) into docs/reference/host/ since the Iris host class is unified
- Keep backend-specific device-side pages under triton/ and gluon/

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Update iris.hip, iris.device_utils, iris.topology, iris.fd_passing,
iris.logging, and iris._distributed_helpers references to new paths
in tests/unittests, tests/examples, and legacy examples.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

- Remove get_backend from docs (was an IrisGluon-only legacy method,
  not present on unified Iris class)
- Update CCL error messages to not reference "Iris Gluon context"

Co-Authored-By: Claude Opus 4.6 <[email protected]>

With a unified Iris host class, get_device_context() always exists.
Remove the dead hasattr guard and "Iris Gluon context" wording.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
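The guard removal described in the commit above can be pictured with a small before/after sketch. The `Host` class, its return value, and both helper functions are hypothetical stand-ins; only the method name get_device_context() comes from the commit message.

```python
class Host:
    """Hypothetical stand-in for the unified host class."""

    def get_device_context(self):
        return [0, 1]  # placeholder payload


def resolve_context_guarded(host):
    # Before: a defensive check, useful only while some host objects
    # lacked the method.
    if not hasattr(host, "get_device_context"):
        raise ValueError("host object has no get_device_context()")
    return host.get_device_context()


def resolve_context(host):
    # After: the method is part of the single host class, so call it directly.
    return host.get_device_context()


guarded_result = resolve_context_guarded(Host())
direct_result = resolve_context(Host())
```

Both paths return the same value for a unified host object, so the guard no longer protects against anything and the error branch is unreachable.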
mawad-amd and others added 7 commits April 22, 2026 00:29
- Add triton.language.extra and triton.language.extra.hip to sphinx
  module mocks so iris.device.utils imports resolve during doc build
- Fix relative cross-references in triton/overview.md and
  gluon/overview.md to point to consolidated host/ pages

Co-Authored-By: Claude Opus 4.6 <[email protected]>

iris.device.utils imports triton.language.target_info which needs
to be mocked alongside triton.language.extra.hip for sphinx autodoc
to resolve device module imports.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
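For readers unfamiliar with Sphinx import mocking, the two commits above could correspond to a docs/conf.py fragment like the one below. This is a sketch that assumes the project uses the standard `autodoc_mock_imports` option of sphinx.ext.autodoc; the actual conf.py may list additional modules or use a different mocking mechanism.

```python
# docs/conf.py (fragment, assumed layout)
# Modules listed here are replaced by mock objects during the doc build,
# so autodoc can import device modules without a working Triton install.
autodoc_mock_imports = [
    "triton.language.extra",
    "triton.language.extra.hip",
    "triton.language.target_info",
]
```

Listing the modules by their dotted names keeps the mock narrow: only the imports that fail at build time are replaced.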
The gluon broadcast test used the old IrisGluon parameter name
(src_rank). The unified Iris.broadcast() uses source_rank.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

torch.distributed.broadcast uses `src` as the parameter name.
Align Iris.broadcast() to match for consistency.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
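The effect of the keyword rename on callers can be shown with a toy function. The `broadcast` below is a hypothetical single-process model, not Iris.broadcast(); only the keyword names `src` and `src_rank` come from the commit messages, and the sketch assumes no alias is kept for the old keyword.

```python
def broadcast(values_by_rank, src=0):
    """Toy broadcast: every rank receives the value held by rank `src`."""
    return [values_by_rank[src]] * len(values_by_rank)


result = broadcast(["a", "b", "c"], src=1)

try:
    broadcast(["a", "b", "c"], src_rank=1)  # pre-rename keyword
    old_keyword_accepted = True
except TypeError:
    old_keyword_accepted = False
```

Under that assumption, call sites that pass the old keyword fail immediately with a TypeError instead of silently broadcasting from the default rank, which is why the test had to be updated alongside the rename.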
broadcast() is a host-side method — no kernel involved, no reason
to have separate triton/gluon test files. Keep one as test_broadcast.py.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

The old iris/iris.py module no longer exists. Update the import path
test to verify DeviceContext is importable from its new location.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
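An import-path test of the kind mentioned above can be written with a small helper. The `exports` function is a hypothetical name, and the sketch is exercised against standard-library modules so it runs without iris installed. The iris module path in the comment is taken from the file list in this PR.

```python
import importlib


def exports(module_name, attribute):
    """Return True if `module_name` imports cleanly and defines `attribute`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attribute)


# Intended use on a machine with iris installed:
#   assert exports("iris.device.triton.context", "DeviceContext")

stdlib_found = exports("json", "loads")
missing_found = exports("module_that_does_not_exist_12345", "anything")
```

Returning a boolean instead of raising lets a single test assert both that the new location works and that a removed location is gone.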
test_rocr_behaviors.py still referenced the old flat path for
VMemAllocator.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Labels

in-progress (We are working on it), iris (Iris project issue)

2 participants