[1/N] Refactor llm_qat example: YAML configs + ModelOptArgParser #1172
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough
Splits the monolithic llm_qat launcher into modular components.
Changes
Sequence Diagram(s): (mermaid diagrams omitted)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 3
🧹 Nitpick comments (2)
tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py (1)
70-79: Consider adding error-handling tests. The empty YAML test is good. Consider adding tests for:
- Non-existent `--config` file path (should raise appropriate error)
- Malformed YAML syntax (should raise parse error, not crash)

Example test for missing config file:

```python
def test_missing_config_file(self, tmp_path):
    parser = ModelOptArgParser((_ModelArgs, _TrainArgs))
    with pytest.raises(FileNotFoundError):
        parser.parse_args_into_dataclasses(
            args=["--config", str(tmp_path / "nonexistent.yaml")]
        )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py` around lines 70 - 79, Add two new unit tests to tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py that exercise error handling in ModelOptArgParser.parse_args_into_dataclasses: one test (e.g., test_missing_config_file) should call parser.parse_args_into_dataclasses with args=["--config", str(tmp_path/"nonexistent.yaml")] and assert it raises FileNotFoundError using pytest.raises; the other test (e.g., test_malformed_yaml_config) should write malformed YAML to a temp file and assert parser.parse_args_into_dataclasses raises the YAML parsing exception (yaml.YAMLError or the parser-specific parse error) using pytest.raises; reference ModelOptArgParser, parse_args_into_dataclasses, and use tmp_path/pytest.raises to locate where to add these tests.
examples/llm_qat/main.py (1)
197-198: Consider validating `quant_cfg` against known configurations. Using `getattr(mtq, quant_args.quant_cfg)` with a user-provided string could potentially access unintended module attributes. While this is a CLI argument (not untrusted network input), adding validation against known configs would be more defensive.
🛡️ Optional: Add validation against known configs

```diff
 if quant_args.quant_cfg is not None:
+    if not hasattr(mtq, quant_args.quant_cfg) or not quant_args.quant_cfg.endswith("_CFG"):
+        raise ValueError(
+            f"Unknown quant_cfg: {quant_args.quant_cfg}. "
+            "See modelopt.torch.quantization.config for available options."
+        )
     quant_args.quant_cfg = getattr(mtq, quant_args.quant_cfg)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/main.py` around lines 197 - 198, The current line uses getattr(mtq, quant_args.quant_cfg) directly which may resolve unintended attributes; validate quant_args.quant_cfg against an explicit whitelist of allowed config names before calling getattr. Update the logic around quant_args.quant_cfg: define the set/list of permitted config names (e.g., known keys in mtq), check that quant_args.quant_cfg is in that list, and only then call getattr(mtq, quant_args.quant_cfg); if not allowed, raise a clear error or exit with a helpful message. Ensure you reference the same symbol names (quant_args.quant_cfg and mtq) so the check is applied right where the getattr is used.
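The whitelist check the prompt describes can be isolated into a small helper. The sketch below is illustrative only: `mtq` here is a stand-in `SimpleNamespace`, not the real `modelopt.torch.quantization` module, and the attribute names are assumptions for demonstration.

```python
from types import SimpleNamespace

# Stand-in for modelopt.torch.quantization (mtq); attribute names are illustrative.
mtq = SimpleNamespace(NVFP4_DEFAULT_CFG={"algorithm": "max"}, _internal_helper=object())


def resolve_quant_cfg(name):
    """Resolve a user-supplied config name, rejecting anything outside the whitelist."""
    allowed = {attr for attr in vars(mtq) if attr.endswith("_CFG")}
    if name not in allowed:
        raise ValueError(f"Unknown quant_cfg: {name!r}. Available: {sorted(allowed)}")
    return getattr(mtq, name)
```

Restricting lookups to names ending in `_CFG` keeps private helpers and unrelated module attributes unreachable from the CLI.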
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.pre-commit-config.yaml:
- Around line 130-142: The pre-commit hook runs examples/llm_qat/main.py but
heavy module-level imports (torch, transformers, modelopt modules) run before
the argument parser (parse_args_into_dataclasses()) and train(), causing
slowness and CUDA failures; fix by lazy-loading those heavy imports (move
torch/transformers/modelopt imports inside train() or into helper functions
called only when not generating docs) or add a tiny dedicated script that
imports only the argument dataclasses for the --generate_docs flow, or check
sys.argv for "--generate_docs" at module import time and skip heavy imports;
reference the examples/llm_qat/main.py entrypoint, the train() function, and
parse_args_into_dataclasses() when making the change.
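The third option mentioned above (checking sys.argv at module import time) could be sketched as follows; `should_skip_heavy_imports` is a hypothetical helper name, and the actual heavy imports are elided.

```python
import sys


def should_skip_heavy_imports(argv):
    """Doc generation only needs the argument dataclasses, so heavy
    dependencies (torch, transformers, modelopt) can be skipped."""
    return "--generate_docs" in argv


# At module import time, before any heavy imports:
if not should_skip_heavy_imports(sys.argv):
    # import torch, transformers, and modelopt modules here (elided in this sketch)
    pass
```

This keeps the `--generate_docs` pre-commit flow fast and CUDA-free without restructuring the rest of the entrypoint.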
In `@examples/llm_qat/ARGUMENTS.md`:
- Line 35: Update the metadata help string for
TrainingArguments.model_max_length in examples/llm_qat/main.py to use the
hyphenated form "right-padded" instead of "right padded"; locate the
TrainingArguments definition or the argument registration for model_max_length
(symbol: TrainingArguments.model_max_length or the argument name
"--model_max_length") and modify its help/metadata string accordingly so the
generated ARGUMENTS.md shows "right-padded".
In `@modelopt/torch/opt/plugins/transformers.py`:
- Around line 73-85: The "--config" handling may access args[idx + 1] when
"--config" is last and raise IndexError; update the block that detects
"--config" to check that idx + 1 < len(args) before reading args[idx + 1], and
if missing raise a clear error (e.g., raise ValueError or use the argparse error
mechanism) with a message like "Missing value for --config" so callers get a
friendly error instead of an IndexError; ensure the rest of the flow (setting
config_path, loading YAML, and calling self.set_defaults(**config)) only runs
when a valid path is present.
---
Nitpick comments:
In `@examples/llm_qat/main.py`:
- Around line 197-198: The current line uses getattr(mtq, quant_args.quant_cfg)
directly which may resolve unintended attributes; validate quant_args.quant_cfg
against an explicit whitelist of allowed config names before calling getattr.
Update the logic around quant_args.quant_cfg: define the set/list of permitted
config names (e.g., known keys in mtq), check that quant_args.quant_cfg is in
that list, and only then call getattr(mtq, quant_args.quant_cfg); if not
allowed, raise a clear error or exit with a helpful message. Ensure you
reference the same symbol names (quant_args.quant_cfg and mtq) so the check is
applied right where the getattr is used.
In `@tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py`:
- Around line 70-79: Add two new unit tests to
tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py that exercise error
handling in ModelOptArgParser.parse_args_into_dataclasses: one test (e.g.,
test_missing_config_file) should call parser.parse_args_into_dataclasses with
args=["--config", str(tmp_path/"nonexistent.yaml")] and assert it raises
FileNotFoundError using pytest.raises; the other test (e.g.,
test_malformed_yaml_config) should write malformed YAML to a temp file and
assert parser.parse_args_into_dataclasses raises the YAML parsing exception
(yaml.YAMLError or the parser-specific parse error) using pytest.raises;
reference ModelOptArgParser, parse_args_into_dataclasses, and use
tmp_path/pytest.raises to locate where to add these tests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9730ae6e-e2ad-4302-b952-ad714e48536f
📒 Files selected for processing (13)
- .pre-commit-config.yaml
- examples/llm_qat/ARGUMENTS.md
- examples/llm_qat/README.md
- examples/llm_qat/configs/finetune.yaml
- examples/llm_qat/configs/ptq_eval.yaml
- examples/llm_qat/configs/qad_nvfp4.yaml
- examples/llm_qat/configs/qat_nvfp4.yaml
- examples/llm_qat/launch.sh
- examples/llm_qat/main.py
- modelopt/torch/opt/plugins/transformers.py
- modelopt/torch/quantization/plugins/transformers_trainer.py
- tests/examples/llm_qat/test_llm_qat.py
- tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py
💤 Files with no reviewable changes (1)
- examples/llm_qat/launch.sh
examples/llm_qat/ARGUMENTS.md
Outdated
| Argument | Type | Default | Description |
|----------|------|---------|-------------|
| `--cache_dir` | `str` | `None` | |
| `--model_max_length` | `int` | `4096` | Maximum sequence length. Sequences will be right padded (and possibly truncated). |
Minor grammar: "right padded" should be "right-padded".
Since this file is auto-generated, the fix should be applied to the source metadata help string in TrainingArguments.model_max_length (in examples/llm_qat/main.py).
🧰 Tools
🪛 LanguageTool
[grammar] ~35-~35: Use a hyphen to join words.
Context: ...sequence length. Sequences will be right padded (and possibly truncated). | | `--...
(QB_NEW_EN_HYPHEN)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/llm_qat/ARGUMENTS.md` at line 35, Update the metadata help string
for TrainingArguments.model_max_length in examples/llm_qat/main.py to use the
hyphenated form "right-padded" instead of "right padded"; locate the
TrainingArguments definition or the argument registration for model_max_length
(symbol: TrainingArguments.model_max_length or the argument name
"--model_max_length") and modify its help/metadata string accordingly so the
generated ARGUMENTS.md shows "right-padded".
```python
# --config <yaml_file>: load YAML as defaults, CLI args override
if "--config" in args:
    idx = args.index("--config")
    config_path = args[idx + 1]
    args = args[:idx] + args[idx + 2 :]  # strip --config <path> from argv
    import yaml

    with open(config_path) as f:
        config = yaml.safe_load(f)
    if config:
        self.set_defaults(**config)

return super().parse_args_into_dataclasses(args=args, **kwargs)
```
Missing validation for --config without a value.
If --config is passed as the last argument without a path, args[idx + 1] will raise an IndexError. Consider adding bounds checking similar to the --generate_docs handling.
🛡️ Proposed fix to add validation

```diff
 # --config <yaml_file>: load YAML as defaults, CLI args override
 if "--config" in args:
     idx = args.index("--config")
+    if idx + 1 >= len(args):
+        raise ValueError("--config requires a path argument")
     config_path = args[idx + 1]
     args = args[:idx] + args[idx + 2 :]  # strip --config <path> from argv
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# --config <yaml_file>: load YAML as defaults, CLI args override
if "--config" in args:
    idx = args.index("--config")
    if idx + 1 >= len(args):
        raise ValueError("--config requires a path argument")
    config_path = args[idx + 1]
    args = args[:idx] + args[idx + 2 :]  # strip --config <path> from argv
    import yaml

    with open(config_path) as f:
        config = yaml.safe_load(f)
    if config:
        self.set_defaults(**config)

return super().parse_args_into_dataclasses(args=args, **kwargs)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelopt/torch/opt/plugins/transformers.py` around lines 73 - 85, The
"--config" handling may access args[idx + 1] when "--config" is last and raise
IndexError; update the block that detects "--config" to check that idx + 1 <
len(args) before reading args[idx + 1], and if missing raise a clear error
(e.g., raise ValueError or use the argparse error mechanism) with a message like
"Missing value for --config" so callers get a friendly error instead of an
IndexError; ensure the rest of the flow (setting config_path, loading YAML, and
calling self.set_defaults(**config)) only runs when a valid path is present.
Codecov Report
❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #1172      +/-   ##
===========================================
- Coverage   74.77%   64.41%   -10.36%
===========================================
  Files         351      351
  Lines       40072    40192     +120
===========================================
- Hits        29964    25890    -4074
- Misses      10108    14302    +4194
```

Flags with carried forward coverage won't be shown.
☔ View full report in Codecov by Sentry.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
examples/llm_qat/main.py (1)
77-79: ⚠️ Potential issue | 🟡 Minor. Grammar: "right padded" should be "right-padded".
This is the source of the grammar issue in the generated ARGUMENTS.md.
📝 Proposed fix

```diff
 model_max_length: int = field(
     default=4096,
     metadata={
         "help": (
-            "Maximum sequence length. Sequences will be right padded (and possibly truncated)."
+            "Maximum sequence length. Sequences will be right-padded (and possibly truncated)."
         )
     },
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/main.py` around lines 77 - 79, Update the help string that currently reads "Maximum sequence length. Sequences will be right padded (and possibly truncated)." to use a hyphenated adjective: change it to "Maximum sequence length. Sequences will be right-padded (and possibly truncated)." Locate the string in examples/llm_qat/main.py where the argument's "help" value for "Maximum sequence length" is defined (the dict entry with key "help" or the parser.add_argument call providing that help text) and replace the phrase "right padded" with "right-padded".
🧹 Nitpick comments (3)
modelopt/torch/opt/plugins/transformers.py (2)
80-83: Missing error handling for YAML file operations. If the config file doesn't exist or contains invalid YAML, the error messages won't be user-friendly. Consider wrapping in try/except with clearer error context.
🛡️ Proposed improvement

```diff
-with open(config_path) as f:
-    config = yaml.safe_load(f)
-if config:
-    self.set_defaults(**config)
+try:
+    with open(config_path) as f:
+        config = yaml.safe_load(f)
+except FileNotFoundError:
+    raise ValueError(f"Config file not found: {config_path}")
+except yaml.YAMLError as e:
+    raise ValueError(f"Invalid YAML in config file {config_path}: {e}")
+if config:
+    self.set_defaults(**config)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/opt/plugins/transformers.py` around lines 80 - 83, Wrap the config file read and YAML parsing around a try/except to provide clear, user-friendly errors: catch FileNotFoundError to indicate the provided config_path is missing, catch yaml.YAMLError to indicate the file contains invalid YAML, and catch a generic Exception for any other I/O issues; in the try block keep the existing open(config_path) -> yaml.safe_load flow and call self.set_defaults(**config) when config is truthy, and in each except produce a clear message or re-raise a new exception that mentions config_path and the original error to aid debugging.
78-83: YAML import inside function body — consider moving to module level. The `import yaml` statement is inside the conditional block. While this works, it's unconventional and may cause slight latency on first use. Moving it to the top-level imports would be cleaner.
♻️ Suggested refactor

```diff
 import dataclasses
 import sys
 import types
 from contextlib import contextmanager
 from pathlib import Path

+import yaml
 import torch
 import transformers
```

Then remove the `import yaml` from line 78.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/opt/plugins/transformers.py` around lines 78 - 83, The local "import yaml" inside the block that reads config_path should be moved to the module top-level and the inline import removed; update the module imports to include "import yaml" and then keep the body as: open(config_path), yaml.safe_load, and if config: self.set_defaults(**config) so that yaml is only imported once at module import time and the inline import is deleted.
tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py (1)
36-120: Consider adding edge case tests for error handling. The current tests cover happy paths well. Consider adding tests for:
- `--config` with non-existent file path
- `--config` with invalid YAML syntax
- `--config` as last argument without a value

💡 Example additional tests

```python
def test_config_file_not_found(self, tmp_path):
    parser = ModelOptArgParser((_ModelArgs, _TrainArgs))
    with pytest.raises(FileNotFoundError):
        parser.parse_args_into_dataclasses(
            args=["--config", str(tmp_path / "nonexistent.yaml")]
        )


def test_config_missing_value(self):
    parser = ModelOptArgParser((_ModelArgs, _TrainArgs))
    with pytest.raises((IndexError, ValueError)):
        parser.parse_args_into_dataclasses(args=["--config"])
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py` around lines 36 - 120, Add edge-case unit tests to tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py covering --config error handling: create a test function (e.g., test_config_file_not_found) that constructs ModelOptArgParser((_ModelArgs, _TrainArgs)) and calls parse_args_into_dataclasses with args=["--config", str(tmp_path/"nonexistent.yaml")] asserting FileNotFoundError; add a test (e.g., test_config_invalid_yaml) that writes malformed YAML to a tmp file and asserts parse_args_into_dataclasses raises yaml.YAMLError or ValueError; add a test (e.g., test_config_missing_value) that calls parse_args_into_dataclasses with args=["--config"] and asserts an IndexError or ValueError is raised. Use the existing ModelOptArgParser and parse_args_into_dataclasses symbols to locate where to add these tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/llm_qat/ARGUMENTS.md`:
- Around line 28-37: The generated TrainingArguments docs omit parameters whose
names are filtered out even when their defaults differ from the HF base; update
the doc generation to surface overridden defaults: in the function
_generate_docs locate the logic that filters out fields by name and change it to
also compare each field's default to the corresponding attribute on
transformers.TrainingArguments (e.g., learning_rate, eval_strategy, do_eval) and
include any field whose default value differs, or alternatively add an explicit
"Overridden defaults" section to ARGUMENTS.md listing each overridden argument
and its non-HF default so users can see values like learning_rate=1e-4.
- Around line 9-11: The generated docs drop union/Optional info (e.g. `str |
None` becoming `str`) because _format_type currently simplifies annotations;
update _format_type so it preserves union/Optional annotations (use
typing.get_origin/get_args or equivalent) and format Union types by joining
inner types with " | " (so a source annotation `str | None` yields `str |
None`), then ensure _generate_docs uses this updated _format_type when rendering
argument types like quant_cfg and other optional fields.
In `@examples/llm_qat/main.py`:
- Around line 64-69: ModelArguments.model_name_or_path lacks metadata helper
text; add a metadata={"help": "..."} to that field so the generated ARGUMENTS.md
includes a description. Edit the ModelArguments class and update the
model_name_or_path declaration to include a clear help string (e.g., "The name
or path of the pretrained model to use.") in the metadata dict, mirroring the
style used for teacher_model.
In `@modelopt/torch/opt/plugins/transformers.py`:
- Around line 116-122: The docstring claims "additional/overridden arguments"
but the current filter (own_fields = [f for f in dataclasses.fields(dc) if
f.name not in hf_training_fields]) only keeps truly new fields; update the
behavior by either (A) changing the text emitted in the lines.append call to
"Only additional arguments are shown below" to remove "overridden", or (B)
change the filtering logic to detect overrides: lookup the corresponding field
in the HuggingFace TrainingArguments (hf_training_fields) and compare default
values (and types if desired) with the dataclass field from
dataclasses.fields(dc), and include the field if the default differs (treating
missing defaults as different); modify the code around is_hf_subclass /
own_fields and the lines.append text accordingly so overridden fields are
included when defaults differ.
---
Outside diff comments:
In `@examples/llm_qat/main.py`:
- Around line 77-79: Update the help string that currently reads "Maximum
sequence length. Sequences will be right padded (and possibly truncated)." to
use a hyphenated adjective: change it to "Maximum sequence length. Sequences
will be right-padded (and possibly truncated)." Locate the string in
examples/llm_qat/main.py where the argument's "help" value for "Maximum sequence
length" is defined (the dict entry with key "help" or the parser.add_argument
call providing that help text) and replace the phrase "right padded" with
"right-padded".
---
Nitpick comments:
In `@modelopt/torch/opt/plugins/transformers.py`:
- Around line 80-83: Wrap the config file read and YAML parsing around a
try/except to provide clear, user-friendly errors: catch FileNotFoundError to
indicate the provided config_path is missing, catch yaml.YAMLError to indicate
the file contains invalid YAML, and catch a generic Exception for any other I/O
issues; in the try block keep the existing open(config_path) -> yaml.safe_load
flow and call self.set_defaults(**config) when config is truthy, and in each
except produce a clear message or re-raise a new exception that mentions
config_path and the original error to aid debugging.
- Around line 78-83: The local "import yaml" inside the block that reads
config_path should be moved to the module top-level and the inline import
removed; update the module imports to include "import yaml" and then keep the
body as: open(config_path), yaml.safe_load, and if config:
self.set_defaults(**config) so that yaml is only imported once at module import
time and the inline import is deleted.
In `@tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py`:
- Around line 36-120: Add edge-case unit tests to
tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py covering --config error
handling: create a test function (e.g., test_config_file_not_found) that
constructs ModelOptArgParser((_ModelArgs, _TrainArgs)) and calls
parse_args_into_dataclasses with args=["--config",
str(tmp_path/"nonexistent.yaml")] asserting FileNotFoundError; add a test (e.g.,
test_config_invalid_yaml) that writes malformed YAML to a tmp file and asserts
parse_args_into_dataclasses raises yaml.YAMLError or ValueError; add a test
(e.g., test_config_missing_value) that calls parse_args_into_dataclasses with
args=["--config"] and asserts an IndexError or ValueError is raised. Use the
existing ModelOptArgParser and parse_args_into_dataclasses symbols to locate
where to add these tests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: bf32b98f-0804-4249-87e1-ce58d13d15e2
📒 Files selected for processing (14)
- .pre-commit-config.yaml
- examples/llm_qad/README.md
- examples/llm_qat/ARGUMENTS.md
- examples/llm_qat/README.md
- examples/llm_qat/configs/finetune.yaml
- examples/llm_qat/configs/ptq_eval.yaml
- examples/llm_qat/configs/qad_nvfp4.yaml
- examples/llm_qat/configs/qat_nvfp4.yaml
- examples/llm_qat/launch.sh
- examples/llm_qat/main.py
- modelopt/torch/opt/plugins/transformers.py
- modelopt/torch/quantization/plugins/transformers_trainer.py
- tests/examples/llm_qat/test_llm_qat.py
- tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py
💤 Files with no reviewable changes (1)
- examples/llm_qat/launch.sh
✅ Files skipped from review due to trivial changes (5)
- examples/llm_qad/README.md
- examples/llm_qat/configs/qat_nvfp4.yaml
- examples/llm_qat/configs/ptq_eval.yaml
- examples/llm_qat/configs/finetune.yaml
- examples/llm_qat/configs/qad_nvfp4.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- .pre-commit-config.yaml
- tests/examples/llm_qat/test_llm_qat.py
examples/llm_qat/ARGUMENTS.md
Outdated
| Argument | Type | Default | Description |
|----------|------|---------|-------------|
| `--quant_cfg` | `str` | `None` | Specify the quantization format for PTQ/QAT. if specified, PTQ/QAT will be enabled with the specified quantization format |
| `--calib_size` | `int` | `512` | Specify the calibration size for quantization. The calibration dataset is used to setup the quantization scale parameters for PTQ/QAT. |
| `--compress` | `bool` | `False` | Whether to compress the model weights after quantization for QLoRA. This is useful for reducing the model size. |
Type display shows str but should be str | None for optional fields.
The --quant_cfg argument has default None but type shows as `str`. The _format_type method in the generator should preserve the full union type. Similar issue applies to other optional fields.
This stems from how the type hint is being formatted in _generate_docs. The source class has quant_cfg: str | None, so the generated docs should reflect that.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/llm_qat/ARGUMENTS.md` around lines 9 - 11, The generated docs drop
union/Optional info (e.g. `str | None` becoming `str`) because _format_type
currently simplifies annotations; update _format_type so it preserves
union/Optional annotations (use typing.get_origin/get_args or equivalent) and
format Union types by joining inner types with " | " (so a source annotation
`str | None` yields `str | None`), then ensure _generate_docs uses this updated
_format_type when rendering argument types like quant_cfg and other optional
fields.
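A union-preserving formatter along the lines the prompt suggests could look like the sketch below; the real `_format_type` in the PR is not shown here, so the signature and handling are assumptions.

```python
import types
import typing

# PEP 604 unions (`str | None`) report types.UnionType as their origin on
# Python 3.10+; typing.Optional/typing.Union report typing.Union.
_UNION_ORIGINS = {typing.Union, getattr(types, "UnionType", typing.Union)}


def format_type(tp):
    """Render a type annotation, preserving unions like `str | None`."""
    if typing.get_origin(tp) in _UNION_ORIGINS:
        return " | ".join(format_type(arg) for arg in typing.get_args(tp))
    if tp is type(None):
        return "None"
    return getattr(tp, "__name__", str(tp))
```

With this, an annotation of `str | None` on `quant_cfg` renders as `str | None` in the generated table instead of collapsing to `str`.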
examples/llm_qat/ARGUMENTS.md
Outdated
## TrainingArguments

Extends [HuggingFace TrainingArguments](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments). Only additional/overridden arguments are shown below.

| Argument | Type | Default | Description |
|----------|------|---------|-------------|
| `--cache_dir` | `str` | `None` | |
| `--model_max_length` | `int` | `4096` | Maximum sequence length. Sequences will be right padded (and possibly truncated). |
| `--lora` | `bool` | `False` | Whether to add LoRA (Low-Rank Adaptation) adapter before training. When using real quantization, the LoRA adapter must be set, as quantized weights will be frozen during training. |
| `--distill` | `bool` | `False` | Select if training with distillation. |
TrainingArguments section omits overridden defaults.
Fields like learning_rate, eval_strategy, do_eval, etc. that override base class defaults are not shown because _generate_docs filters them out by name. Users won't see that this example uses learning_rate=1e-4 instead of the HF default.
Consider documenting the overridden defaults somewhere, or adjusting the generator to include fields with changed defaults.
🧰 Tools
🪛 LanguageTool
[grammar] ~35-~35: Use a hyphen to join words.
Context: ...sequence length. Sequences will be right padded (and possibly truncated). | | `--...
(QB_NEW_EN_HYPHEN)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/llm_qat/ARGUMENTS.md` around lines 28 - 37, The generated
TrainingArguments docs omit parameters whose names are filtered out even when
their defaults differ from the HF base; update the doc generation to surface
overridden defaults: in the function _generate_docs locate the logic that
filters out fields by name and change it to also compare each field's default to
the corresponding attribute on transformers.TrainingArguments (e.g.,
learning_rate, eval_strategy, do_eval) and include any field whose default value
differs, or alternatively add an explicit "Overridden defaults" section to
ARGUMENTS.md listing each overridden argument and its non-HF default so users
can see values like learning_rate=1e-4.
examples/llm_qat/main.py
Outdated
```python
class ModelArguments(ModelOptHFArguments):
    model_name_or_path: str = field(default="meta-llama/Llama-2-7b-hf")
    teacher_model: str | None = field(
        default=None,
        metadata={"help": ("The name or path of the teacher model to use for distillation.")},
    )
```
ModelArguments missing metadata for model_name_or_path.
The model_name_or_path field has no metadata={"help": ...}, resulting in an empty description in the generated ARGUMENTS.md (line 25 shows empty description).
📝 Proposed fix

```diff
 class ModelArguments(ModelOptHFArguments):
-    model_name_or_path: str = field(default="meta-llama/Llama-2-7b-hf")
+    model_name_or_path: str = field(
+        default="meta-llama/Llama-2-7b-hf",
+        metadata={"help": "Path to pretrained model or model identifier from huggingface.co/models."},
+    )
     teacher_model: str | None = field(
```

📝 Committable suggestion
teacher_model: str | None = field(📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
class ModelArguments(ModelOptHFArguments):
    model_name_or_path: str = field(
        default="meta-llama/Llama-2-7b-hf",
        metadata={"help": "Path to pretrained model or model identifier from huggingface.co/models."},
    )
    teacher_model: str | None = field(
        default=None,
        metadata={"help": ("The name or path of the teacher model to use for distillation.")},
    )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/llm_qat/main.py` around lines 64 - 69,
ModelArguments.model_name_or_path lacks metadata helper text; add a
metadata={"help": "..."} to that field so the generated ARGUMENTS.md includes a
description. Edit the ModelArguments class and update the model_name_or_path
declaration to include a clear help string (e.g., "The name or path of the
pretrained model to use.") in the metadata dict, mirroring the style used for
teacher_model.
```python
if is_hf_subclass:
    own_fields = [f for f in dataclasses.fields(dc) if f.name not in hf_training_fields]
    lines.append(
        "Extends [HuggingFace TrainingArguments]"
        "(https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments)."
        " Only additional/overridden arguments are shown below."
    )
```
Documentation says "overridden arguments" but filter only shows truly new fields.
The docstring at line 121 says "Only additional/overridden arguments are shown below", but the filter at line 117 excludes any field whose name exists in the base TrainingArguments. Fields that override defaults (like learning_rate, eval_strategy in examples/llm_qat/main.py) won't appear in generated docs, which contradicts the stated behavior.
Consider either:
- Update the doc text to say "Only additional arguments" (removing "overridden"), or
- Change the logic to compare default values and include fields where the default differs from the base class.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelopt/torch/opt/plugins/transformers.py` around lines 116 - 122, The
docstring claims "additional/overridden arguments" but the current filter
(own_fields = [f for f in dataclasses.fields(dc) if f.name not in
hf_training_fields]) only keeps truly new fields; update the behavior by either
(A) changing the text emitted in the lines.append call to "Only additional
arguments are shown below" to remove "overridden", or (B) change the filtering
logic to detect overrides: lookup the corresponding field in the HuggingFace
TrainingArguments (hf_training_fields) and compare default values (and types if
desired) with the dataclass field from dataclasses.fields(dc), and include the
field if the default differs (treating missing defaults as different); modify
the code around is_hf_subclass / own_fields and the lines.append text
accordingly so overridden fields are included when defaults differ.
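Option (B) from the prompt above can be sketched in isolation. The dataclasses below are placeholders standing in for `TrainingArguments` and its subclass, not the real HF types; the point is only the default-comparison filter:

```python
import dataclasses

@dataclasses.dataclass
class BaseTrainingArguments:
    # Stand-in for HF TrainingArguments (hypothetical fields).
    learning_rate: float = 5e-5
    seed: int = 42

@dataclasses.dataclass
class QATTrainingArguments(BaseTrainingArguments):
    learning_rate: float = 1e-4   # overridden default
    distill: bool = False         # truly new field

# Keep fields that are new OR whose default differs from the base class.
base_defaults = {f.name: f.default for f in dataclasses.fields(BaseTrainingArguments)}
own_fields = [
    f.name
    for f in dataclasses.fields(QATTrainingArguments)
    if f.name not in base_defaults or f.default != base_defaults[f.name]
]
print(own_fields)  # ['learning_rate', 'distill']
```

With this filter, `seed` stays hidden (same default as the base class) while the overridden `learning_rate` is documented alongside the new `distill` flag.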
Actionable comments posted: 3
♻️ Duplicate comments (1)
examples/llm_qat/ARGUMENTS.md (1)
10-10: ⚠️ Potential issue | 🟡 Minor
Minor docs grammar: use “right-padded.”
Line 10 still says “right padded”. Since this is auto-generated, update the originating help string so regenerated docs use “right-padded”.
As per coding guidelines,
`**/*.{py,cpp,h,hpp,md}`: All code (Python, C++, Markdown, etc.) must adhere to coding standards as checked by pre-commit hooks.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/ARGUMENTS.md` at line 10, Update the help string for the CLI option that defines --model_max_length (the add_argument call for '--model_max_length' / the "model_max_length" help text) to use the grammatically correct phrase "right-padded." instead of "right padded" so regenerated docs reflect the change; locate the add_argument or option definition that sets the default/help for model_max_length and change the help string to include "right-padded." and re-run doc generation/pre-commit.
🧹 Nitpick comments (4)
examples/llm_qat/train.py (2)
137-141: Same `trust_remote_code` concern applies to teacher model loading.
The teacher model loading should also use the configurable `trust_remote_code` parameter for consistency.
♻️ Suggested fix

```diff
 teacher_model = transformers.AutoModelForCausalLM.from_pretrained(
     model_args.teacher_model,
     cache_dir=training_args.cache_dir,
     torch_dtype=torch.bfloat16,
+    trust_remote_code=model_args.trust_remote_code,
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/train.py` around lines 137 - 141, The teacher model is being loaded via transformers.AutoModelForCausalLM.from_pretrained without using the configurable trust_remote_code flag; update the teacher_model loading call (the call that assigns to teacher_model and uses model_args.teacher_model) to pass the same trust_remote_code parameter used for the student/model loading (i.e., forward the configured trust_remote_code boolean into from_pretrained) so teacher model remote code is handled consistently.
96-104: Consider exposing `trust_remote_code` as a configurable argument.
The model and tokenizer loading accepts a user-provided `model_name_or_path` but doesn't expose `trust_remote_code` as a configurable parameter. While defaulting to `False` is correct for security, some models require `trust_remote_code=True` to function. Per coding guidelines, this should be caller-configurable rather than implicitly relying on the default.
♻️ Suggested approach

Add `trust_remote_code: bool = False` to `ModelArguments` in `arguments.py`, then pass it to the model and tokenizer loading calls:

```diff
 model = transformers.AutoModelForCausalLM.from_pretrained(
     model_args.model_name_or_path,
     cache_dir=training_args.cache_dir,
     torch_dtype=torch.bfloat16,
+    trust_remote_code=model_args.trust_remote_code,
 )
 ...
 tokenizer = transformers.AutoTokenizer.from_pretrained(
-    model_args.model_name_or_path, model_max_length=model_args.model_max_length
+    model_args.model_name_or_path,
+    model_max_length=model_args.model_max_length,
+    trust_remote_code=model_args.trust_remote_code,
 )
```

As per coding guidelines, "Do not hardcode `trust_remote_code=True` when loading Hugging Face Transformers models. Let the caller decide via a parameter; default to `False`."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/train.py` around lines 96 - 104, The loader calls in train.py (transformers.AutoModelForCausalLM.from_pretrained and transformers.AutoTokenizer.from_pretrained) currently don't expose trust_remote_code; add a new boolean field trust_remote_code: bool = False to the ModelArguments dataclass (arguments.py) and then pass model_args.trust_remote_code into both from_pretrained calls (e.g., trust_remote_code=model_args.trust_remote_code) so callers can opt-in while retaining a safe default.
examples/llm_qat/utils.py (1)
43-43: Consider moving the import to module level for clarity.
The runtime import inside the function works but is unconventional. If there's no circular import concern, moving it to the module level would improve readability and enable better static analysis.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/utils.py` at line 43, The file currently does a runtime import of build_blend_dataset and load_blend_config from dataset_utils inside a function; move the import statement for build_blend_dataset and load_blend_config to the top of examples/llm_qat/utils.py (module-level) and delete the in-function import so callers use the module-level imports instead, ensuring there are no circular-import issues before changing.
examples/llm_qat/dataset_utils.py (1)
721-723: Consider making `trust_remote_code` explicit for defense-in-depth.
While `trust_remote_code` defaults to `False` in HuggingFace Transformers, the coding guidelines recommend making this explicit or caller-configurable. For a CLI tool that accepts arbitrary tokenizer paths, consider adding an explicit `trust_remote_code=False` or exposing it as a CLI flag.
♻️ Suggested improvement

```diff
 tokenizer = transformers.AutoTokenizer.from_pretrained(
-    args.tokenizer, model_max_length=args.max_length
+    args.tokenizer, model_max_length=args.max_length, trust_remote_code=False
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/dataset_utils.py` around lines 721 - 723, The AutoTokenizer instantiation uses transformers.AutoTokenizer.from_pretrained without explicitly setting trust_remote_code, which should be made explicit for defense-in-depth; update the call in dataset_utils.py where tokenizer is created (transformers.AutoTokenizer.from_pretrained, referencing args.tokenizer and args.max_length) to pass trust_remote_code=False or expose a new CLI flag (e.g., args.trust_remote_code) and pass that value to from_pretrained so callers can opt in explicitly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/llm_qat/llama_factory/launch_llamafactory.sh`:
- Line 143: The --use_fsdp2 flag is parsed and shown in help but has no effect
because default config selection no longer reads USE_FSDP2; either remove the
flag from the help and argument parsing or make it effective by wiring it into
config selection and validation. Fix by updating the argument parsing block that
sets USE_FSDP2 (the case branch handling "--use_fsdp2") and the help echo line
to either delete that option, or if you keep it, validate the provided value
(true/false), set the USE_FSDP2 variable accordingly, and use USE_FSDP2 when
choosing the default config (where default config selection currently ignores
USE_FSDP2); if keeping the flag also emit a clear warning/error when an
unsupported value is provided.
In `@examples/llm_qat/quantize.py`:
- Around line 89-93: The calibration forward_loop currently runs forward passes
with gradients enabled and without evaluation mode; wrap the loop in
torch.inference_mode() and call model.eval() before iterating to disable
autograd and dropout/batchnorm updates (e.g., in forward_loop use model.eval()
and with torch.inference_mode(): for batch in calib_dataloader ...); also
consider restoring the original training state after calibration if the caller
expects the model's training flag unchanged.
In `@examples/llm_qat/README.md`:
- Around line 58-67: Update the README examples to use flag-style boolean
arguments to match ARGUMENTS.md: remove explicit True/False values and use the
flags `--distill` and `--lora` where currently written as `--distill True` or
`--lora True` (for example update the train invocation block containing
`--distill True` and any other occurrences such as the `--lora True` example
around line 225) so the CLI examples consistently use `--distill` and `--lora`
flag syntax.
---
Duplicate comments:
In `@examples/llm_qat/ARGUMENTS.md`:
- Line 10: Update the help string for the CLI option that defines
--model_max_length (the add_argument call for '--model_max_length' / the
"model_max_length" help text) to use the grammatically correct phrase
"right-padded." instead of "right padded" so regenerated docs reflect the
change; locate the add_argument or option definition that sets the default/help
for model_max_length and change the help string to include "right-padded." and
re-run doc generation/pre-commit.
---
Nitpick comments:
In `@examples/llm_qat/dataset_utils.py`:
- Around line 721-723: The AutoTokenizer instantiation uses
transformers.AutoTokenizer.from_pretrained without explicitly setting
trust_remote_code, which should be made explicit for defense-in-depth; update
the call in dataset_utils.py where tokenizer is created
(transformers.AutoTokenizer.from_pretrained, referencing args.tokenizer and
args.max_length) to pass trust_remote_code=False or expose a new CLI flag (e.g.,
args.trust_remote_code) and pass that value to from_pretrained so callers can
opt in explicitly.
In `@examples/llm_qat/train.py`:
- Around line 137-141: The teacher model is being loaded via
transformers.AutoModelForCausalLM.from_pretrained without using the configurable
trust_remote_code flag; update the teacher_model loading call (the call that
assigns to teacher_model and uses model_args.teacher_model) to pass the same
trust_remote_code parameter used for the student/model loading (i.e., forward
the configured trust_remote_code boolean into from_pretrained) so teacher model
remote code is handled consistently.
- Around line 96-104: The loader calls in train.py
(transformers.AutoModelForCausalLM.from_pretrained and
transformers.AutoTokenizer.from_pretrained) currently don't expose
trust_remote_code; add a new boolean field trust_remote_code: bool = False to
the ModelArguments dataclass (arguments.py) and then pass
model_args.trust_remote_code into both from_pretrained calls (e.g.,
trust_remote_code=model_args.trust_remote_code) so callers can opt-in while
retaining a safe default.
In `@examples/llm_qat/utils.py`:
- Line 43: The file currently does a runtime import of build_blend_dataset and
load_blend_config from dataset_utils inside a function; move the import
statement for build_blend_dataset and load_blend_config to the top of
examples/llm_qat/utils.py (module-level) and delete the in-function import so
callers use the module-level imports instead, ensuring there are no
circular-import issues before changing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4e285767-bcf3-4de5-8296-f9658a97092c
📒 Files selected for processing (33)
- .pre-commit-config.yaml
- examples/llm_qad/README.md
- examples/llm_qat/ARGUMENTS.md
- examples/llm_qat/README.md
- examples/llm_qat/accelerate_config/fsdp1.yaml
- examples/llm_qat/arguments.py
- examples/llm_qat/configs/accelerate/ddp.yaml
- examples/llm_qat/configs/accelerate/deepspeed.yaml
- examples/llm_qat/configs/accelerate/fsdp2.yaml
- examples/llm_qat/configs/dataset/README.md
- examples/llm_qat/configs/dataset/blend.yaml
- examples/llm_qat/configs/dataset/blend_example.yaml
- examples/llm_qat/configs/dataset/blend_test.yaml
- examples/llm_qat/configs/training/finetune.yaml
- examples/llm_qat/configs/training/ptq_eval.yaml
- examples/llm_qat/configs/training/qad_nvfp4.yaml
- examples/llm_qat/configs/training/qat_nvfp4.yaml
- examples/llm_qat/dataset_utils.py
- examples/llm_qat/launch.sh
- examples/llm_qat/llama_factory/README.md
- examples/llm_qat/llama_factory/launch_llamafactory.sh
- examples/llm_qat/llama_factory/llama_config.yaml
- examples/llm_qat/llama_factory/llama_factory.py
- examples/llm_qat/notebooks/QAT_QAD_Walkthrough.ipynb
- examples/llm_qat/quantize.py
- examples/llm_qat/simple_qat_train.py
- examples/llm_qat/train.py
- examples/llm_qat/utils.py
- examples/vllm_serve/README.md
- modelopt/torch/opt/plugins/transformers.py
- modelopt/torch/quantization/plugins/transformers_trainer.py
- tests/examples/llm_qat/test_llm_qat.py
- tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py
💤 Files with no reviewable changes (2)
- examples/llm_qat/launch.sh
- examples/llm_qat/accelerate_config/fsdp1.yaml
✅ Files skipped from review due to trivial changes (14)
- examples/llm_qad/README.md
- examples/vllm_serve/README.md
- examples/llm_qat/llama_factory/llama_config.yaml
- examples/llm_qat/configs/training/ptq_eval.yaml
- examples/llm_qat/configs/training/finetune.yaml
- examples/llm_qat/configs/training/qat_nvfp4.yaml
- examples/llm_qat/configs/training/qad_nvfp4.yaml
- examples/llm_qat/configs/dataset/blend_test.yaml
- examples/llm_qat/configs/dataset/blend.yaml
- tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py
- examples/llm_qat/configs/dataset/README.md
- examples/llm_qat/configs/dataset/blend_example.yaml
- .pre-commit-config.yaml
- examples/llm_qat/arguments.py
🚧 Files skipped from review as they are similar to previous changes (2)
- modelopt/torch/quantization/plugins/transformers_trainer.py
- modelopt/torch/opt/plugins/transformers.py
```diff
 echo "  <config_file.yaml>     YAML config file for llama_factory"
 echo "  --accelerate_config    Accelerate config file (optional)"
-echo "  --use_fsdp2            Use FSDP2 instead of FSDP1 (default: false)"
+echo "  --use_fsdp2            Use FSDP2 (default: true)"
```
--use_fsdp2 is currently a no-op.
The flag is parsed and documented, but default config selection no longer uses USE_FSDP2, so --use_fsdp2=false has no effect. Please either remove this flag from the interface or explicitly error/warn when it is provided.
♻️ Suggested cleanup

```diff
-echo "Usage: $0 <config_file.yaml> [--accelerate_config <accelerate_config.yaml>] [--use_fsdp2 true|false]"
+echo "Usage: $0 <config_file.yaml> [--accelerate_config <accelerate_config.yaml>]"
-echo "  --use_fsdp2            Use FSDP2 (default: true)"
@@
-USE_FSDP2="true"
@@
-    --use_fsdp2*)
-        if [[ "$1" != *=* ]]; then shift; fi
-        USE_FSDP2="${1#*=}"
-        ;;
```

Also applies to: 187-198, 239-244
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/llm_qat/llama_factory/launch_llamafactory.sh` at line 143, The
--use_fsdp2 flag is parsed and shown in help but has no effect because default
config selection no longer reads USE_FSDP2; either remove the flag from the help
and argument parsing or make it effective by wiring it into config selection and
validation. Fix by updating the argument parsing block that sets USE_FSDP2 (the
case branch handling "--use_fsdp2") and the help echo line to either delete that
option, or if you keep it, validate the provided value (true/false), set the
USE_FSDP2 variable accordingly, and use USE_FSDP2 when choosing the default
config (where default config selection currently ignores USE_FSDP2); if keeping
the flag also emit a clear warning/error when an unsupported value is provided.
```python
def forward_loop(model):
    for batch in calib_dataloader:
        batch = {k: v.to(device) for k, v in batch.items()}
        model(**batch)
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Verify calibration forward loop currently lacks inference_mode/no_grad guards.
rg -n -C2 'def forward_loop|model\(\*\*batch\)|inference_mode|no_grad' examples/llm_qat/quantize.py
```

Repository: NVIDIA/Model-Optimizer
Length of output: 286
🏁 Script executed:

```bash
# Get broader context around forward_loop and its usage
rg -B10 -A10 'def forward_loop' examples/llm_qat/quantize.py
# Check if model.eval() is called elsewhere in the file
rg -n 'model\.eval\(\)' examples/llm_qat/quantize.py
# Check the full context around line 87-95
sed -n '80,100p' examples/llm_qat/quantize.py
```

Repository: NVIDIA/Model-Optimizer
Length of output: 1370
🏁 Script executed:

```bash
# Check torch imports in the file
head -30 examples/llm_qat/quantize.py | grep -E '^import|^from'
# Check all usages of mtq.quantize to see if others use inference_mode
rg -B5 -A5 'mtq\.quantize' examples/llm_qat/
# Verify torch.inference_mode is available (should be)
rg 'torch\.inference_mode' examples/
```

Repository: NVIDIA/Model-Optimizer
Length of output: 4298
🏁 Script executed:

```bash
# Get full imports section
head -50 examples/llm_qat/quantize.py
# Check the calibrate function in simple_qat_train.py
rg -B10 -A10 'def calibrate' examples/llm_qat/simple_qat_train.py
```

Repository: NVIDIA/Model-Optimizer
Length of output: 2695
Add inference mode and eval mode to calibration loop.
The forward_loop function (lines 89–93) executes forward passes without torch.inference_mode() or model evaluation mode, causing unnecessary autograd graph construction and memory overhead during PTQ calibration.
Suggested fix
```diff
 def forward_loop(model):
-    for batch in calib_dataloader:
-        batch = {k: v.to(device) for k, v in batch.items()}
-        model(**batch)
+    model.eval()
+    with torch.inference_mode():
+        for batch in calib_dataloader:
+            batch = {k: v.to(device) for k, v in batch.items()}
+            model(**batch)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/llm_qat/quantize.py` around lines 89 - 93, The calibration
forward_loop currently runs forward passes with gradients enabled and without
evaluation mode; wrap the loop in torch.inference_mode() and call model.eval()
before iterating to disable autograd and dropout/batchnorm updates (e.g., in
forward_loop use model.eval() and with torch.inference_mode(): for batch in
calib_dataloader ...); also consider restoring the original training state after
calibration if the caller expects the model's training flag unchanged.
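The last point, restoring the original training state, can be sketched without torch. `TinyModel` and `calibration_mode` are illustrative names; a real implementation would wrap an `nn.Module` (which exposes the same `training`/`eval()`/`train()` interface) the same way:

```python
from contextlib import contextmanager

class TinyModel:
    # Minimal stand-in with a PyTorch-like training flag (hypothetical,
    # not a real nn.Module).
    def __init__(self):
        self.training = True

    def eval(self):
        self.training = False

    def train(self, mode=True):
        self.training = mode

@contextmanager
def calibration_mode(model):
    # Run calibration in eval mode, then restore the caller's training flag.
    was_training = model.training
    model.eval()
    try:
        yield model
    finally:
        model.train(was_training)

m = TinyModel()
with calibration_mode(m):
    print(m.training)  # False: forward passes run in eval mode here
print(m.training)      # True: original state restored
```

The `try`/`finally` ensures the flag is restored even if a calibration batch raises.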
Actionable comments posted: 1
🧹 Nitpick comments (3)
examples/llm_qat/dataset_utils.py (1)
669-671: Expose `trust_remote_code` as a CLI parameter.
While `trust_remote_code` defaults to `False` (safe), the coding guidelines recommend letting callers configure this parameter. For CLI tools, users should be able to opt in when loading tokenizers that require custom code.
♻️ Suggested improvement

```diff
 parser.add_argument("--tokenizer", required=True, help="Tokenizer name or path")
 parser.add_argument("--max_length", type=int, default=4096, help="Max sequence length")
+parser.add_argument(
+    "--trust_remote_code",
+    action="store_true",
+    default=False,
+    help="Allow executing custom code from the tokenizer repository (use with caution)",
+)
 args = parser.parse_args()
 config = load_blend_config(args.config)
 tokenizer = transformers.AutoTokenizer.from_pretrained(
-    args.tokenizer, model_max_length=args.max_length
+    args.tokenizer,
+    model_max_length=args.max_length,
+    trust_remote_code=args.trust_remote_code,
 )
```

As per coding guidelines: "Do not hardcode `trust_remote_code=True` when loading Hugging Face Transformers models. Let the caller decide via a parameter; default to `False`."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/dataset_utils.py` around lines 669 - 671, Add a CLI flag to expose Hugging Face's trust_remote_code instead of hardcoding it: update the argument parser to accept something like --trust-remote-code (default False) and pass that flag into the tokenizer loading call (transformers.AutoTokenizer.from_pretrained) using args.trust_remote_code, so the tokenizer creation (tokenizer = transformers.AutoTokenizer.from_pretrained(args.tokenizer, model_max_length=args.max_length, trust_remote_code=args.trust_remote_code)) respects the caller's choice.
tests/examples/llm_qat/test_llm_qat.py (2)
82-82: Normalize `tiny_llama_path` before appending it to `extra_cmd_args`.
`_run_quantize()` is typed as taking `list[str]`, and the other filesystem arguments in these calls are already wrapped with `str(...)`. Converting `tiny_llama_path` here keeps the argv contract consistent and avoids surprises if the fixture returns a `Path`.
🧩 Proposed fix

```diff
-    "--model_name_or_path", tiny_llama_path,
+    "--model_name_or_path", str(tiny_llama_path),
```

Apply the same change in all three `_run_quantize()` call sites.
Also applies to: 105-105, 130-130
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/examples/llm_qat/test_llm_qat.py` at line 82, The test passes tiny_llama_path directly into the argv list which can be a Path object; update all three _run_quantize(...) call sites to convert tiny_llama_path to a string (e.g., str(tiny_llama_path)) when adding it to extra_cmd_args so the list[str] contract for _run_quantize and consistency with other filesystem args is preserved; locate the calls that pass "--model_name_or_path", tiny_llama_path and replace tiny_llama_path with its stringified form.
35-35: Avoid relying on `PATH` for the Python and Accelerate entry points.
Using bare `python` and `accelerate` can pick a different environment than the one running pytest, which makes these integration tests brittle in CI and local virtualenvs. If `run_example_command()` does not already normalize the launcher, prefer invoking both steps from the active interpreter.
Also applies to: 52-54
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/examples/llm_qat/test_llm_qat.py` at line 35, The test currently invokes the example scripts via the bare strings "python" and "accelerate" which can resolve to the wrong interpreter; update the test to launch both steps with the active Python interpreter instead: replace occurrences of "python" with the running interpreter (use sys.executable) and replace "accelerate" with a module invocation via the interpreter (i.e. sys.executable with "-m accelerate" or equivalent) when calling run_example_command(); update both the quantize step (strings "python", "quantize.py") and the train/accelerate step (strings "accelerate", "launch" / related args for "train.py") so the commands use the active interpreter rather than relying on PATH.
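Building the argv lists from the active interpreter might look like the sketch below. The script names mirror the test, while `accelerate.commands.launch` is an assumed module entry point that should be verified against the installed accelerate version:

```python
import sys

# Pin both commands to the interpreter running the tests instead of
# resolving "python" / "accelerate" from PATH.
quantize_cmd = [sys.executable, "quantize.py", "--model_name_or_path", "ckpt"]
train_cmd = [sys.executable, "-m", "accelerate.commands.launch", "train.py"]

print(quantize_cmd[0] == sys.executable)  # True
print(train_cmd[:2] == [sys.executable, "-m"])  # True
```

Passing these lists to `subprocess.run` (or the test's `run_example_command()` helper) guarantees both steps use the same environment as pytest.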
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/llm_qat/dataset_utils.py`:
- Around line 224-231: The current broad except Exception in the tokenization
fallback hides unexpected errors; change it to catch only expected
tokenization-related errors (e.g., ValueError, UnicodeEncodeError, and any
specific tokenizer error class your tokenizer exposes) around the tokenizer
invocation, keep the logger.warning and the fallback pad/label construction
using pad_id, max_length, and IGNORE_TOKEN_ID, and for any other exceptions
re-raise them so they surface (i.e., replace "except Exception as e" with a
narrowed except tuple and add a bare raise in an outer except to propagate
unexpected errors).
---
Nitpick comments:
In `@examples/llm_qat/dataset_utils.py`:
- Around line 669-671: Add a CLI flag to expose Hugging Face's trust_remote_code
instead of hardcoding it: update the argument parser to accept something like
--trust-remote-code (default False) and pass that flag into the tokenizer
loading call (transformers.AutoTokenizer.from_pretrained) using
args.trust_remote_code, so the tokenizer creation (tokenizer =
transformers.AutoTokenizer.from_pretrained(args.tokenizer,
model_max_length=args.max_length, trust_remote_code=args.trust_remote_code))
respects the caller's choice.
In `@tests/examples/llm_qat/test_llm_qat.py`:
- Line 82: The test passes tiny_llama_path directly into the argv list which can
be a Path object; update all three _run_quantize(...) call sites to convert
tiny_llama_path to a string (e.g., str(tiny_llama_path)) when adding it to
extra_cmd_args so the list[str] contract for _run_quantize and consistency with
other filesystem args is preserved; locate the calls that pass
"--model_name_or_path", tiny_llama_path and replace tiny_llama_path with its
stringified form.
- Line 35: The test currently invokes the example scripts via the bare strings
"python" and "accelerate" which can resolve to the wrong interpreter; update the
test to launch both steps with the active Python interpreter instead: replace
occurrences of "python" with the running interpreter (use sys.executable) and
replace "accelerate" with a module invocation via the interpreter (i.e.
sys.executable with "-m accelerate" or equivalent) when calling
run_example_command(); update both the quantize step (strings "python",
"quantize.py") and the train/accelerate step (strings "accelerate", "launch" /
related args for "train.py") so the commands use the active interpreter rather
than relying on PATH.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7d1f6438-abe5-4f9c-94c4-93bd1e217408
📒 Files selected for processing (33)
- .pre-commit-config.yaml
- examples/llm_qad/README.md
- examples/llm_qat/ARGUMENTS.md
- examples/llm_qat/README.md
- examples/llm_qat/accelerate_config/fsdp1.yaml
- examples/llm_qat/arguments.py
- examples/llm_qat/configs/accelerate/ddp.yaml
- examples/llm_qat/configs/accelerate/deepspeed.yaml
- examples/llm_qat/configs/accelerate/fsdp2.yaml
- examples/llm_qat/configs/dataset/README.md
- examples/llm_qat/configs/dataset/blend.yaml
- examples/llm_qat/configs/dataset/blend_example.yaml
- examples/llm_qat/configs/dataset/blend_test.yaml
- examples/llm_qat/configs/train/finetune.yaml
- examples/llm_qat/configs/train/ptq_eval.yaml
- examples/llm_qat/configs/train/qad_nvfp4.yaml
- examples/llm_qat/configs/train/qat_nvfp4.yaml
- examples/llm_qat/dataset_utils.py
- examples/llm_qat/launch.sh
- examples/llm_qat/llama_factory/README.md
- examples/llm_qat/llama_factory/launch_llamafactory.sh
- examples/llm_qat/llama_factory/llama_config.yaml
- examples/llm_qat/llama_factory/llama_factory.py
- examples/llm_qat/notebooks/QAT_QAD_Walkthrough.ipynb
- examples/llm_qat/quantize.py
- examples/llm_qat/simple_qat_train.py
- examples/llm_qat/train.py
- examples/llm_qat/utils.py
- examples/vllm_serve/README.md
- modelopt/torch/opt/plugins/transformers.py
- modelopt/torch/quantization/plugins/transformers_trainer.py
- tests/examples/llm_qat/test_llm_qat.py
- tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py
💤 Files with no reviewable changes (2)
- examples/llm_qat/accelerate_config/fsdp1.yaml
- examples/llm_qat/launch.sh
✅ Files skipped from review due to trivial changes (13)
- examples/llm_qat/configs/train/ptq_eval.yaml
- examples/vllm_serve/README.md
- examples/llm_qat/llama_factory/llama_config.yaml
- examples/llm_qat/configs/train/finetune.yaml
- examples/llm_qad/README.md
- examples/llm_qat/configs/train/qat_nvfp4.yaml
- .pre-commit-config.yaml
- examples/llm_qat/configs/dataset/blend.yaml
- examples/llm_qat/configs/train/qad_nvfp4.yaml
- examples/llm_qat/configs/dataset/blend_test.yaml
- examples/llm_qat/configs/dataset/blend_example.yaml
- examples/llm_qat/llama_factory/README.md
- examples/llm_qat/configs/dataset/README.md
🚧 Files skipped from review as they are similar to previous changes (5)
- examples/llm_qat/llama_factory/llama_factory.py
- tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py
- examples/llm_qat/utils.py
- examples/llm_qat/arguments.py
- examples/llm_qat/train.py
```python
except Exception as e:
    logger.warning("Failed to tokenize sample: %s. Skipping.", e)
    pad_id = tokenizer.pad_token_id or tokenizer.eos_token_id or 0
    return {
        "input_ids": [pad_id] * max_length,
        "attention_mask": [0] * max_length,
        "labels": [IGNORE_TOKEN_ID] * max_length,
    }
```
Broad exception handling may hide underlying issues.
Catching all Exception types could mask unexpected errors like MemoryError, KeyboardInterrupt (caught via BaseException), or tokenizer misconfiguration issues that should surface to the developer.
Consider narrowing to expected tokenization failures:
🛠️ Suggested refinement
```diff
-except Exception as e:
-    logger.warning("Failed to tokenize sample: %s. Skipping.", e)
+except (ValueError, TypeError, KeyError) as e:
+    logger.warning("Failed to tokenize sample: %s. Skipping.", e)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-except Exception as e:
+except (ValueError, TypeError, KeyError) as e:
     logger.warning("Failed to tokenize sample: %s. Skipping.", e)
     pad_id = tokenizer.pad_token_id or tokenizer.eos_token_id or 0
     return {
         "input_ids": [pad_id] * max_length,
         "attention_mask": [0] * max_length,
         "labels": [IGNORE_TOKEN_ID] * max_length,
     }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/llm_qat/dataset_utils.py` around lines 224 - 231, The current broad
except Exception in the tokenization fallback hides unexpected errors; change it
to catch only expected tokenization-related errors (e.g., ValueError,
UnicodeEncodeError, and any specific tokenizer error class your tokenizer
exposes) around the tokenizer invocation, keep the logger.warning and the
fallback pad/label construction using pad_id, max_length, and IGNORE_TOKEN_ID,
and for any other exceptions re-raise them so they surface (i.e., replace
"except Exception as e" with a narrowed except tuple and add a bare raise in an
outer except to propagate unexpected errors).
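The narrowed-exception pattern the prompt describes can be sketched in isolation; `tokenize_with_fallback` is a hypothetical helper, not the actual function in `dataset_utils.py`:

```python
import logging

logger = logging.getLogger(__name__)

def tokenize_with_fallback(tokenize, sample, pad_row):
    # Recover only from expected tokenization errors; anything else
    # (MemoryError, bugs, etc.) propagates to the caller.
    try:
        return tokenize(sample)
    except (ValueError, TypeError, KeyError) as e:
        logger.warning("Failed to tokenize sample: %s. Skipping.", e)
        return pad_row

print(tokenize_with_fallback(str.split, "a b", []))     # ['a', 'b']
print(tokenize_with_fallback(int, "not-a-number", []))  # [] (ValueError handled)
```

An unexpected exception, say a `ZeroDivisionError` inside `tokenize`, is not swallowed by the tuple and surfaces normally, which is the behavior the review asks for.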
♻️ Duplicate comments (5)
examples/llm_qat/ARGUMENTS.md (2)
10-10: ⚠️ Potential issue | 🟡 Minor
Grammar: "right padded" should be hyphenated.
Since this file is auto-generated, fix the source help text in the `model_max_length` field metadata in `examples/llm_qat/arguments.py` to use "right-padded".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/ARGUMENTS.md` at line 10, The generated docs show "right padded" unhyphenated; update the source help text for the model_max_length argument in examples/llm_qat/arguments.py (the metadata for the model_max_length argument/flag) to use "right-padded" instead of "right padded" so the auto-generated ARGUMENTS.md will render the hyphenated form; edit the help/message string in the model_max_length field definition where its help/description is set.
Line 11: ⚠️ Potential issue | 🟡 Minor — Type display drops the Optional/Union annotation.

The `--teacher_model` field has default `None`, but its type shows as `str` instead of `str | None`. This stems from `_format_type` in the generator not preserving union types.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/ARGUMENTS.md` at line 11, The rendered arg type drops Optional/Union (so `--teacher_model` shows `str` instead of `str | None`) because _format_type in the generator flattens or ignores union/Optional metadata; update _format_type to detect typing.Union/typing.Optional (use typing.get_origin and typing.get_args) and reconstruct the type string by joining inner argument type names with " | " (ensuring NoneType is rendered as "None"), then return that composed union string; locate the generator function named _format_type and modify its branch that handles non-primitive/annotated types to handle get_origin==Union and format get_args accordingly.

modelopt/torch/opt/plugins/transformers.py (2)
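The Union-aware type formatting described in the comment above could look like this sketch; the standalone name `format_type` is illustrative, not the generator's actual `_format_type`:

```python
from typing import Optional, Union, get_args, get_origin

def format_type(tp) -> str:
    """Render a field annotation for docs, preserving unions.

    Union/Optional members are joined with " | ", and NoneType renders
    as "None", so Optional[str] becomes "str | None".
    """
    if tp is type(None):
        return "None"
    if get_origin(tp) is Union:
        return " | ".join(format_type(arg) for arg in get_args(tp))
    return getattr(tp, "__name__", str(tp))
```

Note that on Python 3.10+, annotations written as `str | None` produce `types.UnionType` rather than `typing.Union`, so a production version would check `get_origin(tp) in (Union, types.UnionType)`.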
Lines 73-85: ⚠️ Potential issue | 🟡 Minor — Missing bounds check for the `--config` argument.

If `--config` is passed as the last argument without a path, `args[idx + 1]` will raise an `IndexError`. Add validation before accessing the next element.

🛡️ Proposed fix

```diff
 # --config <yaml_file>: load YAML as defaults, CLI args override
 if "--config" in args:
     idx = args.index("--config")
+    if idx + 1 >= len(args):
+        raise ValueError("--config requires a path argument")
     config_path = args[idx + 1]
     args = args[:idx] + args[idx + 2 :]  # strip --config <path> from argv
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/opt/plugins/transformers.py` around lines 73 - 85, The code accesses args[idx + 1] without checking bounds when parsing the "--config" flag; add a guard after finding idx to ensure idx + 1 < len(args) and if not raise a clear exception (e.g., ValueError("Missing path after --config")) or call argparse/error handling, so you don't get IndexError. Update the block that computes config_path and strips args (references: args, idx, config_path, yaml.safe_load, self.set_defaults, and super().parse_args_into_dataclasses) to validate the presence of the path before opening the file and only proceed to open/configure if the check passes.
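A minimal sketch of the guarded pattern, assuming a standalone helper rather than the parser's actual `parse_args_into_dataclasses` method; the lazy PyYAML import is an illustrative choice so the bounds check does not depend on the library being installed:

```python
def split_config_arg(args: list[str]) -> tuple[dict, list[str]]:
    """Pull `--config <yaml>` out of argv, returning (yaml_defaults, remaining_args).

    The YAML values are meant to be fed to `set_defaults`, so explicit
    CLI flags in the remaining args still override them.
    """
    if "--config" not in args:
        return {}, list(args)
    idx = args.index("--config")
    if idx + 1 >= len(args):  # bounds check: --config given without a path
        raise ValueError("--config requires a path argument")
    import yaml  # PyYAML; imported lazily for illustration

    with open(args[idx + 1], encoding="utf-8") as f:
        defaults = yaml.safe_load(f) or {}
    return defaults, args[:idx] + args[idx + 2 :]
```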
Lines 116-122: ⚠️ Potential issue | 🟡 Minor — Documentation claims "overridden" but the filter excludes overridden fields.

The docstring says "additional/overridden arguments", but the filter at line 117 excludes any field whose name exists in the base `TrainingArguments`. Fields that override defaults won't appear in generated docs.

Either update the text to say "Only additional arguments" or adjust the logic to include fields with changed defaults.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/opt/plugins/transformers.py` around lines 116 - 122, The doc claims "additional/overridden arguments" but the current filter (own_fields = [f for f in dataclasses.fields(dc) if f.name not in hf_training_fields]) drops any field that exists on HuggingFace TrainingArguments, so overridden fields are omitted; fix by either (A) updating the text to "Only additional arguments are shown below" or (B) include overridden fields by changing the filter to keep fields whose names are in hf_training_fields only when their defaults differ from the base TrainingArguments defaults (e.g., build a map of base_fields = {f.name: f for f in dataclasses.fields(transformers.TrainingArguments)} and include f if f.name not in hf_training_fields OR base_fields.get(f.name) has a different default/default_factory/value than f), and then update lines and the comment accordingly (refer to is_hf_subclass, dataclasses.fields(dc), hf_training_fields, and the lines list).

examples/llm_qat/dataset_utils.py (1)
Lines 224-231: ⚠️ Potential issue | 🟡 Minor — Broad exception handling may mask underlying issues.

Catching all `Exception` types could hide unexpected errors like `MemoryError` or tokenizer misconfiguration. Consider narrowing to expected tokenization failures.

♻️ Suggested refinement

```diff
-    except Exception as e:
+    except (ValueError, TypeError, KeyError) as e:
         logger.warning("Failed to tokenize sample: %s. Skipping.", e)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/dataset_utils.py` around lines 224 - 231, The current blanket except Exception in the tokenization fallback (around tokenizer.pad_token_id, pad_id and IGNORE_TOKEN_ID) is too broad; restrict it to expected tokenization errors (e.g., ValueError, TypeError, UnicodeEncodeError and any tokenizer-specific error class from your tokenizer library such as TokenizerError) so you don't swallow system-level errors like MemoryError, and keep the same fallback return. Also include the exception class/name in the logger message to aid debugging (e.g., log type(e) and message) while leaving the pad_id/labels fallback logic unchanged.
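The narrowed handling could be sketched as below; `safe_tokenize`, its signature, and the `-100` ignore value are hypothetical stand-ins for the example's actual tokenization path:

```python
import logging

logger = logging.getLogger(__name__)
IGNORE_TOKEN_ID = -100  # assumption: the usual HF label-masking sentinel

def safe_tokenize(tokenize_fn, sample, pad_id: int, max_length: int) -> dict:
    """Tokenize one sample; on expected data errors, log and return a
    fully-masked padding row. Unexpected errors (MemoryError, bugs) propagate."""
    try:
        return tokenize_fn(sample)
    except (ValueError, TypeError, KeyError) as e:
        logger.warning("Failed to tokenize sample (%s): %s. Skipping.", type(e).__name__, e)
        return {
            "input_ids": [pad_id] * max_length,
            "attention_mask": [0] * max_length,
            "labels": [IGNORE_TOKEN_ID] * max_length,
        }
```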
🧹 Nitpick comments (5)
examples/llm_qat/train.py (1)
Lines 134-141: Consider raising `ValueError` instead of a bare `assert` for better error handling.

The `assert` will be stripped in optimized mode (`python -O`). For user-facing validation, prefer an explicit check with `ValueError`.

♻️ Proposed fix

```diff
 if training_args.distill:
-    assert model_args.teacher_model is not None, "Teacher model is required for distillation."
+    if model_args.teacher_model is None:
+        raise ValueError("--teacher_model is required when --distill is enabled.")
     teacher_model = transformers.AutoModelForCausalLM.from_pretrained(
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/train.py` around lines 134 - 141, Replace the bare assert that checks for the teacher model when training_args.distill is true with an explicit runtime check that raises ValueError if model_args.teacher_model is None; update the logic around training_args.distill and model_args.teacher_model (before calling transformers.AutoModelForCausalLM.from_pretrained) to raise ValueError("Teacher model is required for distillation.") so the error is always raised in user-facing validation and not stripped in optimized mode, then proceed to load teacher_model as before.

examples/llm_qat/dataset_utils.py (1)
Lines 150-156: Specify an explicit encoding when opening the YAML config file.

Use `encoding="utf-8"` for consistent cross-platform behavior.

♻️ Proposed fix

```diff
 def load_blend_config(config_path: str) -> BlendConfig:
     """Parse a dataset blend YAML file into a :class:`BlendConfig`."""
-    with open(config_path) as f:
+    with open(config_path, encoding="utf-8") as f:
         raw = yaml.safe_load(f)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/llm_qat/dataset_utils.py` around lines 150 - 156, The function load_blend_config currently opens the YAML file without an explicit encoding which can cause cross-platform inconsistencies; update the open call in load_blend_config to specify encoding="utf-8" when opening config_path so yaml.safe_load reads a UTF-8 stream consistently (keep the rest of the logic that constructs DatasetSourceConfig and returns BlendConfig unchanged).

modelopt/torch/opt/plugins/transformers.py (2)
Lines 80-81: Specify an explicit encoding when opening files.

Use `encoding="utf-8"` for consistent cross-platform behavior when reading YAML config files.

♻️ Proposed fix

```diff
-    with open(config_path) as f:
+    with open(config_path, encoding="utf-8") as f:
         config = yaml.safe_load(f)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/opt/plugins/transformers.py` around lines 80 - 81, The file opens the YAML config with open(config_path) which can be platform-dependent; update the open call that reads the config (the with open(...) as f: block where config = yaml.safe_load(f) is executed) to explicitly specify encoding="utf-8" so the YAML is read consistently across platforms, i.e. change the with open usage to include encoding="utf-8".
Line 149: Specify an encoding when writing generated docs.

For consistency with the read path suggested above, use explicit UTF-8 encoding when writing the generated markdown file.

♻️ Proposed fix

```diff
-    Path(output_path).write_text("\n".join(lines) + "\n")
+    Path(output_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/opt/plugins/transformers.py` at line 149, The write to disk uses Path(output_path).write_text("\n".join(lines) + "\n") without an explicit encoding; change the call to pass encoding="utf-8" so the generated markdown is written using UTF-8 (i.e. update the Path(...).write_text invocation that writes the joined lines from the lines variable for output_path to include encoding="utf-8").

tests/examples/llm_qat/test_llm_qat.py (1)
Lines 43-68: `--fsdp_transformer_layer_cls_to_wrap` may be ignored for non-FSDP backends.

This argument is FSDP-specific but is passed to all backends, including `ddp` and `deepspeed`. While HfArgumentParser typically ignores unknown/irrelevant args, consider conditionally adding it only for FSDP backends for clarity.

♻️ Proposed fix

```diff
 def _run_train(extra_cmd_args: list[str], backend: str = "fsdp2"):
     config_file = BACKEND_CONFIGS[backend]
     gradient_args = (
         ["--gradient_checkpointing", "True"] if backend in GRADIENT_CHECKPOINTING_BACKENDS else []
     )
+    fsdp_args = (
+        ["--fsdp_transformer_layer_cls_to_wrap", "LlamaDecoderLayer"]
+        if backend.startswith("fsdp")
+        else []
+    )
     run_example_command(
         [
             "accelerate", "launch", "--config-file", config_file, "train.py",
             "--dataset_config", "configs/dataset/blend_test.yaml",
-            "--fsdp_transformer_layer_cls_to_wrap", "LlamaDecoderLayer",
             "--num_train_epochs", "0.3",
             "--learning_rate", "1e-5",
             "--save_steps", "5",
             "--eval_steps", "5",
             "--train_samples", "256",
             "--eval_samples", "64",
+            *fsdp_args,
             *gradient_args,
             *extra_cmd_args,
         ],
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/examples/llm_qat/test_llm_qat.py` around lines 43 - 68, The test helper _run_train currently always passes the FSDP-only flag "--fsdp_transformer_layer_cls_to_wrap" to run_example_command, which can be confusing for non-FSDP backends; change _run_train to append that flag only when the selected backend is an FSDP backend (e.g., check backend membership in an FSDP_BACKENDS set or reuse GRADIENT_CHECKPOINTING_BACKENDS if it denotes FSDP backends) so that run_example_command is invoked without FSDP-specific args for ddp/deepspeed backends; update references in _run_train where the argument list is built (alongside BACKEND_CONFIGS, GRADIENT_CHECKPOINTING_BACKENDS, and run_example_command) to conditionally include the "--fsdp_transformer_layer_cls_to_wrap", "LlamaDecoderLayer" pair.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@examples/llm_qat/ARGUMENTS.md`:
- Line 10: The generated docs show "right padded" unhyphenated; update the
source help text for the model_max_length argument in
examples/llm_qat/arguments.py (the metadata for the model_max_length
argument/flag) to use "right-padded" instead of "right padded" so the
auto-generated ARGUMENTS.md will render the hyphenated form; edit the
help/message string in the model_max_length field definition where its
help/description is set.
- Line 11: The rendered arg type drops Optional/Union (so `--teacher_model`
shows `str` instead of `str | None`) because _format_type in the generator
flattens or ignores union/Optional metadata; update _format_type to detect
typing.Union/typing.Optional (use typing.get_origin and typing.get_args) and
reconstruct the type string by joining inner argument type names with " | "
(ensuring NoneType is rendered as "None"), then return that composed union
string; locate the generator function named _format_type and modify its branch
that handles non-primitive/annotated types to handle get_origin==Union and
format get_args accordingly.
In `@examples/llm_qat/dataset_utils.py`:
- Around line 224-231: The current blanket except Exception in the tokenization
fallback (around tokenizer.pad_token_id, pad_id and IGNORE_TOKEN_ID) is too
broad; restrict it to expected tokenization errors (e.g., ValueError, TypeError,
UnicodeEncodeError and any tokenizer-specific error class from your tokenizer
library such as TokenizerError) so you don't swallow system-level errors like
MemoryError, and keep the same fallback return. Also include the exception
class/name in the logger message to aid debugging (e.g., log type(e) and
message) while leaving the pad_id/labels fallback logic unchanged.
In `@modelopt/torch/opt/plugins/transformers.py`:
- Around line 73-85: The code accesses args[idx + 1] without checking bounds
when parsing the "--config" flag; add a guard after finding idx to ensure idx +
1 < len(args) and if not raise a clear exception (e.g., ValueError("Missing path
after --config")) or call argparse/error handling, so you don't get IndexError.
Update the block that computes config_path and strips args (references: args,
idx, config_path, yaml.safe_load, self.set_defaults, and
super().parse_args_into_dataclasses) to validate the presence of the path before
opening the file and only proceed to open/configure if the check passes.
- Around line 116-122: The doc claims "additional/overridden arguments" but the
current filter (own_fields = [f for f in dataclasses.fields(dc) if f.name not in
hf_training_fields]) drops any field that exists on HuggingFace
TrainingArguments, so overridden fields are omitted; fix by either (A) updating
the text to "Only additional arguments are shown below" or (B) include
overridden fields by changing the filter to keep fields whose names are in
hf_training_fields only when their defaults differ from the base
TrainingArguments defaults (e.g., build a map of base_fields = {f.name: f for f
in dataclasses.fields(transformers.TrainingArguments)} and include f if f.name
not in hf_training_fields OR base_fields.get(f.name) has a different
default/default_factory/value than f), and then update lines and the comment
accordingly (refer to is_hf_subclass, dataclasses.fields(dc),
hf_training_fields, and the lines list).
---
Nitpick comments:
In `@examples/llm_qat/dataset_utils.py`:
- Around line 150-156: The function load_blend_config currently opens the YAML
file without an explicit encoding which can cause cross-platform
inconsistencies; update the open call in load_blend_config to specify
encoding="utf-8" when opening config_path so yaml.safe_load reads a UTF-8 stream
consistently (keep the rest of the logic that constructs DatasetSourceConfig and
returns BlendConfig unchanged).
In `@examples/llm_qat/train.py`:
- Around line 134-141: Replace the bare assert that checks for the teacher model
when training_args.distill is true with an explicit runtime check that raises
ValueError if model_args.teacher_model is None; update the logic around
training_args.distill and model_args.teacher_model (before calling
transformers.AutoModelForCausalLM.from_pretrained) to raise ValueError("Teacher
model is required for distillation.") so the error is always raised in
user-facing validation and not stripped in optimized mode, then proceed to load
teacher_model as before.
In `@modelopt/torch/opt/plugins/transformers.py`:
- Around line 80-81: The file opens the YAML config with open(config_path) which
can be platform-dependent; update the open call that reads the config (the with
open(...) as f: block where config = yaml.safe_load(f) is executed) to
explicitly specify encoding="utf-8" so the YAML is read consistently across
platforms, i.e. change the with open usage to include encoding="utf-8".
- Line 149: The write to disk uses Path(output_path).write_text("\n".join(lines)
+ "\n") without an explicit encoding; change the call to pass encoding="utf-8"
so the generated markdown is written using UTF-8 (i.e. update the
Path(...).write_text invocation that writes the joined lines from the lines
variable for output_path to include encoding="utf-8").
In `@tests/examples/llm_qat/test_llm_qat.py`:
- Around line 43-68: The test helper _run_train currently always passes the
FSDP-only flag "--fsdp_transformer_layer_cls_to_wrap" to run_example_command,
which can be confusing for non-FSDP backends; change _run_train to append that
flag only when the selected backend is an FSDP backend (e.g., check backend
membership in an FSDP_BACKENDS set or reuse GRADIENT_CHECKPOINTING_BACKENDS if
it denotes FSDP backends) so that run_example_command is invoked without
FSDP-specific args for ddp/deepspeed backends; update references in _run_train
where the argument list is built (alongside BACKEND_CONFIGS,
GRADIENT_CHECKPOINTING_BACKENDS, and run_example_command) to conditionally
include the "--fsdp_transformer_layer_cls_to_wrap", "LlamaDecoderLayer" pair.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3c9bac8a-7059-4aa8-aa2f-adab7aefd0d7
📒 Files selected for processing (34)
- .pre-commit-config.yaml
- examples/llm_qad/README.md
- examples/llm_qat/ARGUMENTS.md
- examples/llm_qat/README.md
- examples/llm_qat/accelerate_config/fsdp1.yaml
- examples/llm_qat/arguments.py
- examples/llm_qat/configs/accelerate/ddp.yaml
- examples/llm_qat/configs/accelerate/deepspeed.yaml
- examples/llm_qat/configs/accelerate/fsdp2.yaml
- examples/llm_qat/configs/dataset/README.md
- examples/llm_qat/configs/dataset/blend.yaml
- examples/llm_qat/configs/dataset/blend_example.yaml
- examples/llm_qat/configs/dataset/blend_test.yaml
- examples/llm_qat/configs/train/finetune.yaml
- examples/llm_qat/configs/train/ptq_eval.yaml
- examples/llm_qat/configs/train/qad_nvfp4.yaml
- examples/llm_qat/configs/train/qat_nvfp4.yaml
- examples/llm_qat/dataset_utils.py
- examples/llm_qat/launch.sh
- examples/llm_qat/llama_factory/README.md
- examples/llm_qat/llama_factory/launch_llamafactory.sh
- examples/llm_qat/llama_factory/llama_config.yaml
- examples/llm_qat/llama_factory/llama_factory.py
- examples/llm_qat/notebooks/QAT_QAD_Walkthrough.ipynb
- examples/llm_qat/quantize.py
- examples/llm_qat/simple_qat_train.py
- examples/llm_qat/train.py
- examples/llm_qat/utils.py
- examples/vllm_serve/README.md
- modelopt/torch/opt/plugins/transformers.py
- modelopt/torch/quantization/plugins/transformers_trainer.py
- modelopt_recipes/general/ptq/int4_blockwise_weight_only.yml
- tests/examples/llm_qat/test_llm_qat.py
- tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py
💤 Files with no reviewable changes (2)
- examples/llm_qat/accelerate_config/fsdp1.yaml
- examples/llm_qat/launch.sh
✅ Files skipped from review due to trivial changes (15)
- examples/llm_qat/configs/train/ptq_eval.yaml
- examples/vllm_serve/README.md
- examples/llm_qad/README.md
- examples/llm_qat/llama_factory/llama_config.yaml
- examples/llm_qat/configs/dataset/blend_test.yaml
- examples/llm_qat/configs/train/finetune.yaml
- modelopt_recipes/general/ptq/int4_blockwise_weight_only.yml
- examples/llm_qat/notebooks/QAT_QAD_Walkthrough.ipynb
- examples/llm_qat/configs/train/qad_nvfp4.yaml
- examples/llm_qat/llama_factory/launch_llamafactory.sh
- examples/llm_qat/configs/dataset/blend.yaml
- tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py
- examples/llm_qat/configs/dataset/blend_example.yaml
- examples/llm_qat/configs/dataset/README.md
- examples/llm_qat/arguments.py
🚧 Files skipped from review as they are similar to previous changes (7)
- examples/llm_qat/configs/train/qat_nvfp4.yaml
- examples/llm_qat/llama_factory/README.md
- examples/llm_qat/simple_qat_train.py
- examples/llm_qat/llama_factory/llama_factory.py
- .pre-commit-config.yaml
- modelopt/torch/quantization/plugins/transformers_trainer.py
- examples/llm_qat/quantize.py
…elOpt argument parser

- Add qlora_nvfp4.yaml config and expand qat/qad/finetune YAML configs with full parameter sets
- Remove ptq_eval.yaml (superseded by unified quantize.py flow)
- Add DistillArguments class in modelopt/torch/distill with distill, teacher_model, criterion fields and to_distill_kwargs() helper
- Move --distill from TrainingArguments to DistillArguments
- Remove TrainModelArguments from train.py, use DistillArguments instead
- Reorder transformers.py: patching on top, arguments middle, training bottom
- Include QuantizeArguments in train.py parser for ARGUMENTS.md generation
- Rename output_dir -> quantize_output_dir in QuantizeArguments to avoid conflict with HF TrainingArguments.output_dir
- Regenerate ARGUMENTS.md with all argument classes
- Group distillation args in qad_nvfp4.yaml config
- Update README.md, tests, and quantize.py for renamed field

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Signed-off-by: realAsma <[email protected]>
Summary
- `ModelOptArgParser` & `ModelOptHFArguments` in `modelopt.torch.opt.plugins.transformers`: an `HfArgumentParser` subclass that adds `--config <yaml>` support (YAML defaults with CLI overrides) and `--generate_docs` to auto-generate an `ARGUMENTS.md` reference from dataclass metadata.
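The `--generate_docs` idea — rendering an argument reference from dataclass field metadata — might be sketched like this; the dataclass and helper below are hypothetical illustrations, not ModelOpt's implementation:

```python
import dataclasses
from pathlib import Path

@dataclasses.dataclass
class ExampleArgs:  # hypothetical stand-in for one argument dataclass
    lr: float = dataclasses.field(default=1e-4, metadata={"help": "Learning rate."})
    distill: bool = dataclasses.field(default=False, metadata={"help": "Enable distillation."})

def generate_docs(dc, output_path: str) -> None:
    """Write a markdown table of --flags, types, defaults, and help text."""
    lines = ["| Argument | Type | Default | Description |", "| --- | --- | --- | --- |"]
    for f in dataclasses.fields(dc):
        type_name = getattr(f.type, "__name__", str(f.type))
        lines.append(
            f"| `--{f.name}` | `{type_name}` | `{f.default}` | {f.metadata.get('help', '')} |"
        )
    Path(output_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```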
- Replaced `launch.sh` with declarative YAML configs and a simplified `train.py` that uses `ModelOptArgParser` for argument parsing.
- Dataset blending configs (`configs/dataset/`) with `dataset_utils.py` supporting multi-source blending, weighted sampling, and train/eval splits.
- Standalone `quantize.py` for PTQ/export workflows without requiring a full training setup.
- Configs organized into `configs/accelerate/`, `configs/dataset/`, and `configs/training/` subdirectories.
- A pre-commit hook regenerates `ARGUMENTS.md` when the parser or `train.py` changes.

Future: Unified dataset config
The dataset blending config (`configs/dataset/`) is designed to be framework-agnostic. In the future, this config schema can be unified across all training frameworks and promoted to `modelopt_recipes/dataset/` for shared use across examples.

Test plan
- `pytest tests/unit/torch/opt/plugins/test_modelopt_arg_parser.py`
- `pytest tests/examples/llm_qat/test_llm_qat.py`
- `pre-commit run --all-files`
- `accelerate launch ... train.py --config configs/training/qat_nvfp4.yaml` on 2x GPU

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Tests
Chores