[OMNIML-3252][ONNX] MOQ + Autotune moq integration docs #1026

Merged

gcunhase merged 6 commits into NVIDIA:main from gcunhase:dev/gcunhasergio/autotune_moq_integration_docs on Mar 12, 2026

Conversation


@gcunhase gcunhase commented Mar 11, 2026

What does this PR do?

Type of change: documentation

Overview: This PR updates the documentation and does some folder re-structuring and file re-naming related to #951.

Usage

Documentation

Testing

Documentation

Before your PR is "Ready for review"

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md?: ✅
  • Did you write any new necessary tests?: N/A
  • Did you update Changelog?: ✅ (renamed AutoQDQ to Autotune)

Summary by CodeRabbit

  • Documentation
    • Renamed AutoQDQ to Autotune across guides and changelog.
    • Updated Autotune guide descriptions and wording.
    • Added a new section on optimizing Q/DQ node placement with Autotune, including CLI usage and API links.
    • Applied minor grammar and capitalization corrections.

@gcunhase gcunhase requested review from a team as code owners March 11, 2026 18:49
@gcunhase gcunhase requested review from galagam and realAsma March 11, 2026 18:49

coderabbitai bot commented Mar 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.
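
The auto-pause setting named above maps onto the `.coderabbit.yaml` nesting implied by its dotted path. A minimal sketch — the threshold value here is purely illustrative, not a documented default:

```yaml
# .coderabbit.yaml — sketch derived from the dotted setting name
# reviews.auto_review.auto_pause_after_reviewed_commits; the value 10
# is an illustrative assumption, check CodeRabbit's schema for defaults.
reviews:
  auto_review:
    auto_pause_after_reviewed_commits: 10
```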

📝 Walkthrough

Documentation-only edits: renaming "AutoQDQ" to "Autotune" in guides/changelog, minor grammar fix in CNN QAT README, and addition of an "Optimize Q/DQ node placement with Autotune" section (CLI example and API link) to the ONNX PTQ README.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Autotune Guide: `docs/source/guides/9_autotune.rst` | Title changed to "Autotune (ONNX)"; phrasing around Q/DQ placement optimization updated. |
| ONNX PTQ README additions: `examples/onnx_ptq/README.md` | New "Optimize Q/DQ node placement with Autotune" section added (includes --autotune CLI example and link to API guide). |
| Minor docs fix: `examples/cnn_qat/README.md`, `CHANGELOG.rst` | Grammar/capitalization fix in CNN QAT README; CHANGELOG entry renamed from "AutoQDQ" to "Autotune" and wording updated. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title references MOQ and Autotune documentation updates, which aligns with the primary changes across multiple documentation files (guides, READMEs, CHANGELOG) related to renaming AutoQDQ to Autotune and updating related documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Security Anti-Patterns | ✅ Passed | PR contains only documentation changes (.rst and .md files) with no Python code modifications to modelopt package or examples, making security review inapplicable. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@gcunhase gcunhase requested a review from ajrasane March 11, 2026 18:51
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
examples/onnx/README.md (1)

227-236: Improve documentation clarity for --autotune usage.

Both --autotune (without a value) and --autotune=<quick|default|extensive> are valid: the parser accepts the flag without a value and falls back to "default" mode. Consider updating the prose to clarify that --autotune alone enables autotuning in the default mode, and that the example shows how to explicitly select an alternative mode.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/onnx/README.md` around lines 227 - 236, Update the README text
around the ONNX quantization CLI example to clarify that the --autotune flag can
be provided with no value (which enables autotune in "default" mode) or with an
explicit mode (e.g., --autotune=quick|default|extensive); specifically modify
the prose near the python -m modelopt.onnx.quantization invocation and the
--autotune usage line so it states that using --autotune alone equals
--autotune=default and show the explicit form for selecting quick or extensive
modes.
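
The flag behavior described in this comment can be sketched with a small argparse parser. This is an illustrative stand-in, not ModelOpt's actual CLI definition, but it reproduces the documented semantics: a bare --autotune selects "default", an explicit --autotune=quick|default|extensive overrides it, and omitting the flag disables autotuning.

```python
import argparse

# Illustrative parser mimicking the documented --autotune semantics;
# not ModelOpt's real quantization CLI.
parser = argparse.ArgumentParser(prog="quantize-sketch")
parser.add_argument(
    "--autotune",
    nargs="?",        # the flag may appear with or without a value
    const="default",  # bare --autotune is equivalent to --autotune=default
    default=None,     # omitting the flag disables autotune
    choices=["quick", "default", "extensive"],
)

print(parser.parse_args(["--autotune"]).autotune)            # default
print(parser.parse_args(["--autotune=extensive"]).autotune)  # extensive
print(parser.parse_args([]).autotune)                        # None
```

The `nargs="?"` / `const` combination is the standard argparse idiom for an optional-value flag, which matches the reviewer's description of the parser's behavior.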
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/cnn_qat/README.md`:
- Line 146: Replace the incorrect phrase "deploying a ONNX PTQ" with the
corrected phrase "deploying an ONNX PTQ model" in the README sentence that
compares QAT export to ONNX PTQ deployment; locate the sentence containing
"deploying a ONNX PTQ" and update it to read "...similar to deploying an ONNX
PTQ model from ModelOpt."


ℹ️ Review info
⚙️ Run configuration

Configuration used: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d25531fc-0e08-447d-a76c-5f7b632da19c

📥 Commits

Reviewing files that changed from the base of the PR, between 34a9fc7 and 68171c80d329fc7bf81534d98cb44b218aed4f4e.

📒 Files selected for processing (22)
  • docs/source/guides/9_autotune.rst
  • examples/cnn_qat/README.md
  • examples/onnx/README.md
  • examples/onnx/autotune/README.md
  • examples/onnx/download_example_onnx.py
  • examples/onnx/evaluate.py
  • examples/onnx/evaluation.py
  • examples/onnx/image_prep.py
  • examples/onnx/requirements.txt
  • tests/gpu/onnx/quantization/test_concat_elim.py
  • tests/gpu/onnx/quantization/test_plugin.py
  • tests/gpu/onnx/quantization/test_qdq_utils_fp8.py
  • tests/gpu/onnx/quantization/test_quantize_fp8.py
  • tests/gpu/onnx/quantization/test_quantize_onnx_torch_int4_awq.py
  • tests/unit/onnx/quantization/test_convtranspose_qdq.py
  • tests/unit/onnx/quantization/test_dq_transpose_surgery.py
  • tests/unit/onnx/quantization/test_qdq_rules_int8.py
  • tests/unit/onnx/quantization/test_qdq_utils.py
  • tests/unit/onnx/quantization/test_quant_utils.py
  • tests/unit/onnx/quantization/test_quantize_api.py
  • tests/unit/onnx/quantization/test_quantize_int8.py
  • tests/unit/onnx/quantization/test_quantize_zint4.py


codecov bot commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.12%. Comparing base (72a5b3d) to head (56d7065).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
```text
@@            Coverage Diff             @@
##             main    #1026      +/-   ##
==========================================
+ Coverage   70.11%   70.12%   +0.01%
==========================================
  Files         221      221
  Lines       25459    25459
==========================================
+ Hits        17851    17854       +3
+ Misses       7608     7605       -3
```
☔ View full report in Codecov by Sentry.
@cjluo-nv

qq what's the context for onnx_ptq -> onnx change under the examples/ dir?


gcunhase commented Mar 11, 2026

> qq what's the context for onnx_ptq -> onnx change under the examples/ dir?

This is related to your request in #841 (comment) to reduce top-level folders. I've renamed this back to onnx_ptq and moved the autotune folder into it. Let me know if that looks good to you. Thanks!

@gcunhase gcunhase requested review from cjluo-nv and removed request for galagam March 11, 2026 19:38
@gcunhase gcunhase enabled auto-merge (squash) March 12, 2026 16:19
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
@gcunhase gcunhase force-pushed the dev/gcunhasergio/autotune_moq_integration_docs branch from 57f8b58 to 56d7065 on March 12, 2026 16:31
@gcunhase gcunhase disabled auto-merge March 12, 2026 17:07
@gcunhase gcunhase enabled auto-merge (squash) March 12, 2026 17:09
@gcunhase gcunhase merged commit 69c0d47 into NVIDIA:main Mar 12, 2026
40 checks passed
DrXuQian pushed a commit to DrXuQian/Model-Optimizer that referenced this pull request Mar 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants