
Mamba MOE Quant Configs + Fix Export Bug #882

Open

jenchen13 wants to merge 4 commits into main from jennifchen/nmh_configs

Conversation


@jenchen13 jenchen13 commented Feb 12, 2026

What does this PR do?

Type of change: Bug fix

Overview:

  • Fix a bug in MCore export exclude_modules where layer prefixes had an extra trailing period
  • Add custom quantization configs for Mamba MoE models

Usage

# Add a code snippet demonstrating how to use this
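
A minimal usage sketch, assuming the new configs are exposed at the package level like the existing ones (e.g. mtq.FP8_DEFAULT_CFG); model and calib_dataloader here are hypothetical placeholders for a Mamba MoE model and its calibration data:

    # Hedged sketch, not the PR's documented usage: quantize a Mamba MoE model
    # with one of the new conservative FP8 configs via the standard mtq API.
    import modelopt.torch.quantization as mtq

    def forward_loop(model):
        # Run a handful of calibration batches through the model.
        for batch in calib_dataloader:
            model(batch)

    model = mtq.quantize(model, mtq.MAMBA_MOE_FP8_CONSERVATIVE_CFG, forward_loop)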

Testing

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes/No
  • Did you write any new necessary tests?: Yes/No
  • Did you add or update any necessary documentation?: Yes/No
  • Did you update Changelog?: Yes/No

Additional Information

Summary by CodeRabbit

  • New Features

    • Added four new Mamba MOE quantization configurations: aggressive and conservative variants for both the FP8 and NVFP4 quantization schemes.
  • Bug Fixes

    • Fixed export module exclusion pattern handling to strip trailing dots from exclude patterns during export (see the sketch after this list).
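
A minimal sketch of the kind of normalization the fix applies; the prefix and exclude_modules names here are illustrative, not the actual code in _get_quantized_state:

    # Illustrative only: strip the trailing period from a layer prefix before
    # recording it, so "decoder.layers.0." is stored as "decoder.layers.0" and
    # matches the module names checked at export time.
    prefix = prefix.rstrip(".")
    exclude_modules.add(prefix)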

@jenchen13 jenchen13 requested review from a team as code owners February 12, 2026 17:12

copy-pr-bot bot commented Feb 12, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.



coderabbitai bot commented Feb 12, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

📝 Walkthrough

This PR adds Mamba MOE-specific quantization configuration variants for FP8 and NVFP4 quantizers, and fixes export module exclusion pattern handling by stripping trailing dots from prefix names.

Changes

  • Export Pattern Fix — modelopt/torch/export/unified_export_megatron.py: Modified exclude-module pattern handling in _get_quantized_state to strip trailing dots from prefixes before recording exclusions, improving pattern consistency.
  • Mamba MOE Quantization Configs — modelopt/torch/quantization/config.py: Introduced new Mamba MOE-specific quantization configurations (MAMBA_MOE_FP8_AGGRESSIVE_CFG, MAMBA_MOE_FP8_CONSERVATIVE_CFG, MAMBA_MOE_NVFP4_AGGRESSIVE_CFG, MAMBA_MOE_NVFP4_CONSERVATIVE_CFG) with disabled MoE-related quantizers (fc1_latent_proj, fc2_latent_proj, q/k/v_proj). The configurations extend existing config families, with mixer projections selectively disabled in the conservative NVFP4 variants (sketched below).
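
For illustration, a hedged sketch of what one of these configs might look like, following the {"quant_cfg": ..., "algorithm": ...} convention of the existing configs in modelopt/torch/quantization/config.py; the exact patterns and defaults in the PR may differ:

    # Hypothetical shape only. num_bits=(4, 3) is the FP8 E4M3 format used by
    # the existing FP8 configs; the disabled entries mirror the quantizers
    # named above.
    MAMBA_MOE_FP8_CONSERVATIVE_CFG = {
        "quant_cfg": {
            "*weight_quantizer": {"num_bits": (4, 3), "axis": None},
            "*input_quantizer": {"num_bits": (4, 3), "axis": None},
            # Leave the MoE latent projections and attention q/k/v unquantized.
            "*fc1_latent_proj*": {"enable": False},
            "*fc2_latent_proj*": {"enable": False},
            "*q_proj*": {"enable": False},
            "*k_proj*": {"enable": False},
            "*v_proj*": {"enable": False},
            "default": {"enable": False},
        },
        "algorithm": "max",
    }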

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

🚥 Pre-merge checks (3 passed)
  • Description Check — ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: The title directly addresses both main changes, the Mamba MOE quantization configurations and the export bug fix, accurately summarizing the changeset.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 100.00%, which meets the required threshold of 80.00%.




codecov bot commented Feb 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.74%. Comparing base (95511a0) to head (c9fb020).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #882   +/-   ##
=======================================
  Coverage   73.73%   73.74%           
=======================================
  Files         199      199           
  Lines       21165    21170    +5     
=======================================
+ Hits        15606    15611    +5     
  Misses       5559     5559           

☔ View full report in Codecov by Sentry.

@ChenhanYu ChenhanYu left a comment (Collaborator)

The formatting test is failing.

@ChenhanYu ChenhanYu requested a review from meenchen February 12, 2026 17:36
@jenchen13 jenchen13 force-pushed the jennifchen/nmh_configs branch from 700c32d to 5b983e3 Compare February 12, 2026 18:39
@jenchen13 jenchen13 enabled auto-merge (squash) February 12, 2026 18:39
@ChenhanYu ChenhanYu disabled auto-merge February 13, 2026 00:12
@Fridah-nv Fridah-nv left a comment (Contributor)

LGTM

Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>
Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>
Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>
Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>
@jenchen13 jenchen13 force-pushed the jennifchen/nmh_configs branch from 801881e to c9fb020 Compare February 13, 2026 17:25
@jenchen13 jenchen13 enabled auto-merge (squash) February 13, 2026 17:28