[OMNIML-3252][ONNX] MOQ + Autotune moq integration docs #1026

Merged

gcunhase merged 6 commits into NVIDIA:main from gcunhase:dev/gcunhasergio/autotune_moq_integration_docs on Mar 12, 2026

Conversation


@gcunhase gcunhase commented Mar 11, 2026

What does this PR do?

Type of change: documentation

Overview: This PR updates the documentation and does some folder re-structuring and file re-naming related to #951.

Usage

Documentation

Testing

Documentation

Before your PR is "Ready for review"

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md?: ✅
  • Did you write any new necessary tests?: N/A
  • Did you update Changelog?: ✅ (renamed AutoQDQ to Autotune)

Summary by CodeRabbit

  • Documentation
    • Renamed AutoQDQ to Autotune across guides and changelog.
    • Updated Autotune guide descriptions and wording.
    • Added a new section on optimizing Q/DQ node placement with Autotune, including CLI usage and API links.
    • Applied minor grammar and capitalization corrections.

@gcunhase gcunhase requested review from a team as code owners March 11, 2026 18:49
@gcunhase gcunhase requested review from galagam and realAsma March 11, 2026 18:49

coderabbitai bot commented Mar 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.
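
The auto-pause setting named above maps onto the `.coderabbit.yaml` nesting implied by its dotted path. A minimal sketch — the threshold value here is purely illustrative, not a documented default:

```yaml
# .coderabbit.yaml — sketch derived from the dotted setting name
# reviews.auto_review.auto_pause_after_reviewed_commits; the value 10
# is an illustrative assumption, check CodeRabbit's schema for defaults.
reviews:
  auto_review:
    auto_pause_after_reviewed_commits: 10
```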

📝 Walkthrough

Documentation-only edits: renaming "AutoQDQ" to "Autotune" in guides/changelog, minor grammar fix in CNN QAT README, and addition of an "Optimize Q/DQ node placement with Autotune" section (CLI example and API link) to the ONNX PTQ README.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Autotune Guide: `docs/source/guides/9_autotune.rst` | Title changed to "Autotune (ONNX)"; phrasing around Q/DQ placement optimization updated. |
| ONNX PTQ README additions: `examples/onnx_ptq/README.md` | New "Optimize Q/DQ node placement with Autotune" section added (includes --autotune CLI example and link to API guide). |
| Minor docs fix: `examples/cnn_qat/README.md`, `CHANGELOG.rst` | Grammar/capitalization fix in CNN QAT README; CHANGELOG entry renamed from "AutoQDQ" to "Autotune" and wording updated. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title references MOQ and Autotune documentation updates, which aligns with the primary changes across multiple documentation files (guides, READMEs, CHANGELOG) related to renaming AutoQDQ to Autotune and updating related documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Security Anti-Patterns | ✅ Passed | PR contains only documentation changes (.rst and .md files) with no Python code modifications to modelopt package or examples, making security review inapplicable. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@gcunhase gcunhase requested a review from ajrasane March 11, 2026 18:51
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
examples/onnx/README.md (1)

227-236: Improve documentation clarity for --autotune usage.

Both --autotune (without a value) and --autotune=<quick|default|extensive> are valid: the parser accepts the flag without a value and falls back to "default" mode. Consider updating the prose to clarify that --autotune alone enables autotuning in the default mode, and that the example shows how to explicitly select an alternative mode.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/onnx/README.md` around lines 227 - 236, Update the README text
around the ONNX quantization CLI example to clarify that the --autotune flag can
be provided with no value (which enables autotune in "default" mode) or with an
explicit mode (e.g., --autotune=quick|default|extensive); specifically modify
the prose near the python -m modelopt.onnx.quantization invocation and the
--autotune usage line so it states that using --autotune alone equals
--autotune=default and show the explicit form for selecting quick or extensive
modes.
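
The flag behavior described in this comment can be sketched with a small argparse parser. This is an illustrative stand-in, not ModelOpt's actual CLI definition, but it reproduces the documented semantics: a bare --autotune selects "default", an explicit --autotune=quick|default|extensive overrides it, and omitting the flag disables autotuning.

```python
import argparse

# Illustrative parser mimicking the documented --autotune semantics;
# not ModelOpt's real quantization CLI.
parser = argparse.ArgumentParser(prog="quantize-sketch")
parser.add_argument(
    "--autotune",
    nargs="?",        # the flag may appear with or without a value
    const="default",  # bare --autotune is equivalent to --autotune=default
    default=None,     # omitting the flag disables autotune
    choices=["quick", "default", "extensive"],
)

print(parser.parse_args(["--autotune"]).autotune)            # default
print(parser.parse_args(["--autotune=extensive"]).autotune)  # extensive
print(parser.parse_args([]).autotune)                        # None
```

The `nargs="?"` / `const` combination is the standard argparse idiom for an optional-value flag, which matches the reviewer's description of the parser's behavior.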
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/cnn_qat/README.md`:
- Line 146: Replace the incorrect phrase "deploying a ONNX PTQ" with the
corrected phrase "deploying an ONNX PTQ model" in the README sentence that
compares QAT export to ONNX PTQ deployment; locate the sentence containing
"deploying a ONNX PTQ" and update it to read "...similar to deploying an ONNX
PTQ model from ModelOpt."


ℹ️ Review info
⚙️ Run configuration

Configuration used: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d25531fc-0e08-447d-a76c-5f7b632da19c

📥 Commits

Reviewing files that changed from the base of the PR, between 34a9fc7 and 68171c80d329fc7bf81534d98cb44b218aed4f4e.

📒 Files selected for processing (22)
  • docs/source/guides/9_autotune.rst
  • examples/cnn_qat/README.md
  • examples/onnx/README.md
  • examples/onnx/autotune/README.md
  • examples/onnx/download_example_onnx.py
  • examples/onnx/evaluate.py
  • examples/onnx/evaluation.py
  • examples/onnx/image_prep.py
  • examples/onnx/requirements.txt
  • tests/gpu/onnx/quantization/test_concat_elim.py
  • tests/gpu/onnx/quantization/test_plugin.py
  • tests/gpu/onnx/quantization/test_qdq_utils_fp8.py
  • tests/gpu/onnx/quantization/test_quantize_fp8.py
  • tests/gpu/onnx/quantization/test_quantize_onnx_torch_int4_awq.py
  • tests/unit/onnx/quantization/test_convtranspose_qdq.py
  • tests/unit/onnx/quantization/test_dq_transpose_surgery.py
  • tests/unit/onnx/quantization/test_qdq_rules_int8.py
  • tests/unit/onnx/quantization/test_qdq_utils.py
  • tests/unit/onnx/quantization/test_quant_utils.py
  • tests/unit/onnx/quantization/test_quantize_api.py
  • tests/unit/onnx/quantization/test_quantize_int8.py
  • tests/unit/onnx/quantization/test_quantize_zint4.py


codecov bot commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.12%. Comparing base (72a5b3d) to head (56d7065).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
```text
@@            Coverage Diff             @@
##             main    #1026      +/-   ##
==========================================
+ Coverage   70.11%   70.12%   +0.01%
==========================================
  Files         221      221
  Lines       25459    25459
==========================================
+ Hits        17851    17854       +3
+ Misses       7608     7605       -3
```
☔ View full report in Codecov by Sentry.
@cjluo-nv

qq what's the context for onnx_ptq -> onnx change under the examples/ dir?


gcunhase commented Mar 11, 2026

> qq what's the context for onnx_ptq -> onnx change under the examples/ dir?

This is related to your request in #841 (comment) to reduce top-level folders. I've renamed this back to onnx_ptq and moved the autotune folder into it. Let me know if that looks good to you. Thanks!

@gcunhase gcunhase requested review from cjluo-nv and removed request for galagam March 11, 2026 19:38
@gcunhase gcunhase enabled auto-merge (squash) March 12, 2026 16:19
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
@gcunhase gcunhase force-pushed the dev/gcunhasergio/autotune_moq_integration_docs branch from 57f8b58 to 56d7065 on March 12, 2026 16:31
@gcunhase gcunhase disabled auto-merge March 12, 2026 17:07
@gcunhase gcunhase enabled auto-merge (squash) March 12, 2026 17:09
@gcunhase gcunhase merged commit 69c0d47 into NVIDIA:main Mar 12, 2026
40 checks passed
DrXuQian pushed a commit to DrXuQian/Model-Optimizer that referenced this pull request Mar 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants