[OMNIML-3252][ONNX] MOQ + Autotune integration docs (#1026)
Conversation

> Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Documentation-only edits: renaming "AutoQDQ" to "Autotune" in guides and the changelog, a minor grammar fix in the CNN QAT README, and the addition of an "Optimize Q/DQ node placement with Autotune" section (CLI example and API link) to the ONNX PTQ README.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
🚥 Pre-merge checks: ✅ 4 passed
Actionable comments posted: 1
🧹 Nitpick comments (1)
examples/onnx/README.md (1)
227-236: Improve documentation clarity for `--autotune` usage. Both `--autotune` (without a value) and `--autotune=<quick|default|extensive>` are valid. The parser accepts the flag without a value and defaults to "default" mode. Consider updating the prose to clarify that `--autotune` alone enables autotune with the default mode, while the example shows how to explicitly specify an alternative mode.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/onnx/README.md` around lines 227 - 236, Update the README text around the ONNX quantization CLI example to clarify that the --autotune flag can be provided with no value (which enables autotune in "default" mode) or with an explicit mode (e.g., --autotune=quick|default|extensive); specifically modify the prose near the python -m modelopt.onnx.quantization invocation and the --autotune usage line so it states that using --autotune alone equals --autotune=default and show the explicit form for selecting quick or extensive modes.
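The optional-value flag behavior described in this comment (bare `--autotune` being equivalent to `--autotune=default`) is the standard argparse pattern of `nargs="?"` plus `const`. The following is a minimal illustrative sketch of that pattern, not ModelOpt's actual argument parser:

```python
import argparse

# Sketch of a flag that works both bare (--autotune) and with an
# explicit value (--autotune=quick|default|extensive).
parser = argparse.ArgumentParser()
parser.add_argument(
    "--autotune",
    nargs="?",        # the value is optional
    const="default",  # bare --autotune behaves like --autotune=default
    default=None,     # flag absent: autotune disabled
    choices=["quick", "default", "extensive"],
)

print(parser.parse_args([]).autotune)                    # None (disabled)
print(parser.parse_args(["--autotune"]).autotune)        # default
print(parser.parse_args(["--autotune=quick"]).autotune)  # quick
```

Under this pattern, documenting only `--autotune=<mode>` undersells the flag: the bare form is equally valid, which is exactly the clarification the review comment asks for.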
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/cnn_qat/README.md`:
- Line 146: Replace the incorrect phrase "deploying a ONNX PTQ" with the
corrected phrase "deploying an ONNX PTQ model" in the README sentence that
compares QAT export to ONNX PTQ deployment; locate the sentence containing
"deploying a ONNX PTQ" and update it to read "...similar to deploying an ONNX
PTQ model from ModelOpt."
---
Nitpick comments:
In `@examples/onnx/README.md`:
- Around line 227-236: Update the README text around the ONNX quantization CLI
example to clarify that the --autotune flag can be provided with no value (which
enables autotune in "default" mode) or with an explicit mode (e.g.,
--autotune=quick|default|extensive); specifically modify the prose near the
python -m modelopt.onnx.quantization invocation and the --autotune usage line so
it states that using --autotune alone equals --autotune=default and show the
explicit form for selecting quick or extensive modes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d25531fc-0e08-447d-a76c-5f7b632da19c
📥 Commits
Reviewing files that changed from the base of the PR and between 34a9fc7 and 68171c80d329fc7bf81534d98cb44b218aed4f4e.
📒 Files selected for processing (22)
- docs/source/guides/9_autotune.rst
- examples/cnn_qat/README.md
- examples/onnx/README.md
- examples/onnx/autotune/README.md
- examples/onnx/download_example_onnx.py
- examples/onnx/evaluate.py
- examples/onnx/evaluation.py
- examples/onnx/image_prep.py
- examples/onnx/requirements.txt
- tests/gpu/onnx/quantization/test_concat_elim.py
- tests/gpu/onnx/quantization/test_plugin.py
- tests/gpu/onnx/quantization/test_qdq_utils_fp8.py
- tests/gpu/onnx/quantization/test_quantize_fp8.py
- tests/gpu/onnx/quantization/test_quantize_onnx_torch_int4_awq.py
- tests/unit/onnx/quantization/test_convtranspose_qdq.py
- tests/unit/onnx/quantization/test_dq_transpose_surgery.py
- tests/unit/onnx/quantization/test_qdq_rules_int8.py
- tests/unit/onnx/quantization/test_qdq_utils.py
- tests/unit/onnx/quantization/test_quant_utils.py
- tests/unit/onnx/quantization/test_quantize_api.py
- tests/unit/onnx/quantization/test_quantize_int8.py
- tests/unit/onnx/quantization/test_quantize_zint4.py
Codecov Report: ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

@@ Coverage Diff @@
##             main    #1026      +/-   ##
==========================================
+ Coverage   70.11%   70.12%   +0.01%
==========================================
  Files         221      221
  Lines       25459    25459
==========================================
+ Hits        17851    17854       +3
+ Misses       7608     7605       -3

☔ View full report in Codecov by Sentry.
qq what's the context for the onnx_ptq -> onnx change under the examples/ dir?
This is related to your request in #841 (comment) to reduce top-level folders. I've renamed this back to
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
Force-pushed 57f8b58 to 56d7065 (Compare)
### What does this PR do?

**Type of change**: documentation

**Overview**: This PR updates the documentation and does some folder re-structuring and file re-naming related to NVIDIA#951.

### Usage

Documentation

### Testing

Documentation

### Before your PR is "*Ready for review*"

- Is this change backward compatible?: ✅
- If you copied code from any other sources or added a new PIP dependency, did you follow guidance in `CONTRIBUTING.md`: ✅
- Did you write any new necessary tests?: N/A
- Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?: ✅ (renamed `AutoQDQ` to `Autotune`)

## Summary by CodeRabbit

* **Documentation**
  * Renamed AutoQDQ to Autotune across guides and changelog.
  * Updated Autotune guide descriptions and wording.
  * Added a new section on optimizing Q/DQ node placement with Autotune, including CLI usage and API links (appears twice in one README).
  * Applied minor grammar and capitalization corrections.

Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>