Support for modelopt with MoE QAT by HollowMan6 · Pull Request #3866 · NVIDIA-NeMo/Megatron-Bridge

HollowMan6 · 2026-05-17T21:25:07Z

What does this PR do ?

Move all the patches that support ModelOpt-based QAT from the verl side into Megatron Bridge.

Changelog

Referring to verl-project/verl@b96c8fb & verl-project/verl@04542a1

GitHub Actions CI

See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items you can still open "Draft" PR.

Additional Information

Related to # (issue)

Copilot

Pull request overview

Adds support paths for ModelOpt/QAT MoE conversion behavior in Megatron Bridge, especially around local expert naming, quantized module detection, and related unit coverage.

Changes:

Extends expert-number extraction and sorting to handle local_experts.<N> names.
Updates conversion task filtering and EP name-globalization paths for adapter/quantizer handling.
Registers quantized parallel module type detection and adds QAT bridge support tests.
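
To illustrate the first item, here is a minimal sketch of expert-number extraction that accepts both naming schemes. The function name, the pattern list, and the sample parameter names are assumptions made for illustration; the actual helper in `common_utils.py` may differ.

```python
import re

# Hypothetical pattern list, most specific first (an assumption, not the real code).
_EXPERT_PATTERNS = (
    re.compile(r"local_experts\.(\d+)"),   # SequentialMLP style: ...local_experts.3.linear_fc1.weight
    re.compile(r"(?:weight|bias)(\d+)$"),  # grouped style: ...experts.linear_fc1.weight3
)


def extract_expert_number(param_name: str) -> int:
    """Return the expert index encoded in a parameter name."""
    for pattern in _EXPERT_PATTERNS:
        match = pattern.search(param_name)
        if match:
            return int(match.group(1))
    raise ValueError(f"no expert number found in {param_name!r}")


local_idx = extract_expert_number("decoder.layers.0.mlp.experts.local_experts.3.linear_fc1.weight")
grouped_idx = extract_expert_number("decoder.layers.0.mlp.experts.linear_fc1.weight7")
```

Checking the `local_experts` pattern first keeps the lookup unambiguous when a name could otherwise be inspected by a looser pattern.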

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `tests/unit_tests/models/test_qat_bridge_support.py` | Adds unit coverage for local expert naming, quantized mappings, adapter filtering, and ModelOpt layer spec exposure. |
| `src/megatron/bridge/utils/common_utils.py` | Extends expert number extraction for local expert parameter names. |
| `src/megatron/bridge/models/conversion/utils.py` | Extends deterministic sort keys for SequentialMLP local experts. |
| `src/megatron/bridge/models/conversion/peft_bridge.py` | Adds shared base-weight export skip logic for adapters and quantizer params. |
| `src/megatron/bridge/models/conversion/param_mapping.py` | Registers quantized parallel linear module names and generalizes fused LayerNorm-column detection. |
| `src/megatron/bridge/models/conversion/model_bridge.py` | Updates local-to-global expert name handling and applies the new skip predicate in conversion task construction. |
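
As a rough illustration of the deterministic sort key mentioned for `utils.py`, the sketch below orders parameter names by numeric expert index rather than by string comparison. The function name `sort_key`, the tuple layout, and the sample names are hypothetical and not taken from the PR.

```python
from __future__ import annotations

import re


def sort_key(param_name: str) -> tuple[str, int]:
    """Return (name with the expert index removed, expert index)."""
    grouped = re.search(r"(weight|bias)(\d+)$", param_name)
    if grouped:
        # Grouped experts: strip the trailing digits, keep them as the index.
        return param_name[: grouped.start(2)], int(grouped.group(2))
    local = re.search(r"local_experts\.(\d+)", param_name)
    if local:
        # Local experts: cut the digits out of the middle of the name.
        base = param_name[: local.start(1)] + param_name[local.end(1):]
        return base, int(local.group(1))
    return param_name, -1


names = [
    "mlp.experts.local_experts.10.linear_fc1.weight",
    "mlp.experts.local_experts.2.linear_fc1.weight",
    "mlp.experts.local_experts.1.linear_fc1.weight",
]
ordered = sorted(names, key=sort_key)
```

A plain string sort would place expert 10 before expert 2; comparing the parsed integer avoids that.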

HollowMan6 · 2026-05-17T21:44:03Z

/claude review

claude · 2026-05-17T21:48:28Z

Review — Support for modelopt with MoE QAT

Overall this looks solid. The separation of grouped-expert (weight<N> / bias<N>) vs. local-expert (local_experts.<N>) naming is clean and well-tested.

What I checked

model_bridge.py: The new is_local_expert_param / is_grouped_expert_param split correctly routes EP name globalization. The regex avoids matching quantizer params like weight_quantizer._amax. The replace(..., 1) count-limit on the local_experts branch prevents accidental double-replacement.
param_mapping.py: QuantColumnParallelLinear / QuantRowParallelLinear registry additions are straightforward. The substring-based _FUSED_LAYER_NORM_COLUMN_PARALLEL_SUBSTRING check is a good generalization.
utils.py: extract_sort_key correctly falls back to local_experts.N when no (weight|bias)N suffix is found.
common_utils.py: Pattern ordering in extract_expert_number_from_param is correct — local_experts is most specific and checked first.
Test file: 11 focused unit tests cover the key paths including adapter exclusion, quantizer param passthrough, EP gather with local experts, and the new parallelism detection for Quant modules.
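
The EP name globalization described for `model_bridge.py` can be sketched roughly as follows. This is not the PR's implementation: the function name, the regexes, and the `ep_rank * experts_per_rank` offset rule are assumptions used only to show the two branches and the quantizer passthrough.

```python
import re

_LOCAL_EXPERT = re.compile(r"local_experts\.(\d+)")
_GROUPED_EXPERT = re.compile(r"\.(weight|bias)(\d+)$")


def local_name_to_global(name: str, ep_rank: int, experts_per_rank: int) -> str:
    """Rewrite a rank-local expert index into a global one (assumed offset rule)."""
    offset = ep_rank * experts_per_rank
    local = _LOCAL_EXPERT.search(name)
    if local:
        new = f"local_experts.{int(local.group(1)) + offset}"
        # count=1 so only the first occurrence is rewritten.
        return name.replace(local.group(0), new, 1)
    grouped = _GROUPED_EXPERT.search(name)
    if grouped:
        return f"{name[: grouped.start(2)]}{int(grouped.group(2)) + offset}"
    # No expert index, e.g. a quantizer buffer on a grouped expert: leave as-is.
    return name


g_local = local_name_to_global("mlp.experts.local_experts.1.linear_fc1.weight", 2, 4)
g_grouped = local_name_to_global("mlp.experts.linear_fc1.weight1", 2, 4)
g_quant = local_name_to_global("mlp.experts.linear_fc1.weight_quantizer._amax", 2, 4)
```

Because the grouped pattern requires digits directly after `weight` or `bias` at the end of the name, `weight_quantizer._amax` is not mistaken for an expert weight.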

Minor inline comment

One readability nit on operator precedence (not x in y -> x not in y) posted inline.
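
For context on the nit, a small example showing that the two spellings evaluate identically, so the change is purely about readability. The parameter name is made up.

```python
name = "decoder.layers.0.mlp.linear_fc1.weight"

# `not x in y` parses as `not (x in y)`, so it gives the same result...
legacy = not "adapter" in name

# ...but `x not in y` is the idiomatic spelling (pycodestyle reports the other form as E713).
idiomatic = "adapter" not in name
```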

Suggested test cases

No perf tests impacted.

Refer to verl-project/verl@b96c8fb Signed-off-by: Hollow Man <hollowman@opensuse.org>

HollowMan6 · 2026-05-17T22:16:17Z

/claude review

claude · 2026-05-17T22:22:12Z

Code Review: Support for modelopt with MoE QAT

The changes cleanly extend expert param handling to support SequentialMLP local_experts.<N> naming alongside the existing grouped expert weight<N>/bias<N> naming, and add Quant*ParallelLinear types to the module registry. The regex work in _megatron_local_name_to_global is careful about avoiding false positives on quantizer params like weight_quantizer._amax.
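
A minimal sketch of how a class-name registry plus a substring check might classify modules, assuming the detection works on `type(module).__name__`. The set names, the returned labels, and the `TELayerNormColumnParallelLinear` stand-in are assumptions; the real `_detect_parallelism_type` also handles ModelOpt dynamic modules, which this sketch omits.

```python
# Hypothetical registries keyed by class name (not the PR's actual data structures).
_COLUMN_PARALLEL = {"ColumnParallelLinear", "QuantColumnParallelLinear"}
_ROW_PARALLEL = {"RowParallelLinear", "QuantRowParallelLinear"}
_FUSED_SUBSTRING = "LayerNormColumnParallelLinear"


def detect_parallelism_type(module: object) -> str:
    """Classify a module by its class name; labels are illustrative only."""
    cls_name = type(module).__name__
    # Substring match covers any *LayerNormColumnParallelLinear variant.
    if _FUSED_SUBSTRING in cls_name or cls_name in _COLUMN_PARALLEL:
        return "column"
    if cls_name in _ROW_PARALLEL:
        return "row"
    return "replicated"


# Plain stand-in classes built with type(), similar in spirit to a unit-test stub.
quant_row = type("QuantRowParallelLinear", (), {})()
fused = type("TELayerNormColumnParallelLinear", (), {})()
other = type("LayerNorm", (), {})()

kinds = [detect_parallelism_type(m) for m in (quant_row, fused, other)]
```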

Observations

Test coverage is solid. The new test file covers all the key code paths: local expert mapping, grouped expert mapping, adapter exclusion, quantizer exclusion, EP gather, parallelism type detection, and registry contents.
_FUSED_LAYER_NORM_COLUMN_PARALLEL_SUBSTRING is a good generalization. Switching from an exact class-name set to a substring check future-proofs against any *LayerNormColumnParallelLinear variant without needing registry updates.
Minor: _detect_parallelism_type test only covers the non-modelopt path. The test at line 159 creates a plain type(...) module, so is_modelopt_dynamic_module returns False. In the real modelopt scenario, the module is dynamic and get_original_cls_by_level(level=0).__name__ is used. The substring check works either way, but a test exercising the dynamic-module branch would strengthen confidence. Not a blocker.
gather_from_ep_ranks regex note (param_mapping.py:769-770). The pre-existing re.sub(r"experts\.(\d+)", ...) on HF param names works because HF params use experts.N (not local_experts.N). Worth noting this would need updating if any HF model family adopts local_experts naming.
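
The last observation can be made concrete with a short sketch of rewriting the expert index in an HF parameter name. The helper name, the sample HF name, and the choice of 4 experts per rank are assumptions for illustration; only the `experts\.(\d+)` pattern comes from the review comment.

```python
import re


def with_expert_index(hf_param_name: str, expert_id: int) -> str:
    """Replace the expert index in an HF-style name such as ...experts.1.gate_proj.weight."""
    return re.sub(r"experts\.(\d+)", f"experts.{expert_id}", hf_param_name)


hf_name = "model.layers.0.mlp.experts.1.gate_proj.weight"

# One HF name per EP rank, assuming 4 experts per rank and local index 1.
per_rank = [with_expert_index(hf_name, rank * 4 + 1) for rank in range(2)]
```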

Suggested test cases

No perf tests impacted.

Signed-off-by: Hollow Man <hollowman@opensuse.org>

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

Copilot AI review requested due to automatic review settings May 17, 2026 21:25

Copilot started reviewing on behalf of HollowMan6 May 17, 2026 21:25 View session

Copilot AI reviewed May 17, 2026

Comment thread src/megatron/bridge/models/conversion/peft_bridge.py Outdated

Comment thread src/megatron/bridge/models/conversion/model_bridge.py Outdated

HollowMan6 force-pushed the qat_patch branch from 0c99b88 to 4335955 Compare May 17, 2026 21:43

claude Bot reviewed May 17, 2026

Comment thread src/megatron/bridge/models/conversion/model_bridge.py Outdated

Support for modelopt with MoE QAT

a508708

Refer to verl-project/verl@b96c8fb Signed-off-by: Hollow Man <hollowman@opensuse.org>

HollowMan6 force-pushed the qat_patch branch from 4335955 to a508708 Compare May 17, 2026 22:14

Address claude review

06a8aa4

Signed-off-by: Hollow Man <hollowman@opensuse.org>

HollowMan6 requested a review from Copilot May 17, 2026 22:36

Copilot started reviewing on behalf of HollowMan6 May 17, 2026 22:37 View session

Copilot AI reviewed May 17, 2026

yaoyu-33 added the area:quant, feature, and needs-review labels May 17, 2026

yaoyu-33 approved these changes May 20, 2026

yaoyu-33 merged commit a75738a into NVIDIA-NeMo:main May 20, 2026
101 checks passed

HollowMan6 deleted the qat_patch branch May 20, 2026 01:44

yaoyu-33 merged 2 commits into NVIDIA-NeMo:main from HollowMan6:qat_patch