Add Param2 model bridge by meghmak13 · Pull Request #3834 · NVIDIA-NeMo/Megatron-Bridge

meghmak13 · 2026-05-14T19:56:38Z

What does this PR do ?

This PR add model bridge and provide files for Param-2 17B
Add a one line overview of what this PR aims to accomplish.

Changelog

Add specific line by line info of high level changes in this PR.

GitHub Actions CI

See the CI sectionin the Contributing doc for how to trigger the CI. A Nvidia developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items you can still open "Draft" PR.

Additional Information

Related to # (issue)

copy-pr-bot · 2026-05-14T19:56:43Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

claude · 2026-05-14T19:59:30Z

@@ -0,0 +1,446 @@
+import contextlib
+import fnmatch


Missing NVIDIA copyright header. Per project rules, all new Python files (except under tests/) must include the Apache 2.0 copyright header with year 2026. Same applies to param_17B_provider.py.

claude · 2026-05-14T19:59:32Z

+from megatron.bridge.models.param2 import (
+  Param2ModelProvider  
+        )


Formatting issues: 2-space indent (should be 4), trailing whitespace after Param2ModelProvider, and missing trailing comma. Also, Param2Bridge should be imported here to match the pattern of other model bridges.

Suggested change

from megatron.bridge.models.param2 import (

Param2ModelProvider

)

from megatron.bridge.models.param2 import (

Param2Bridge,

Param2ModelProvider,

)

claude · 2026-05-14T19:59:36Z

+                f"produced by build_conversion_tasks()."
+            )
+
+        return filtered
+


Bug: moe_router_enable_expert_bias defaults to False in provider_bridge() (line 80) but to True in megatron_to_hf_config() here, and True in the provider dataclass. This asymmetry means an HF→Megatron→HF roundtrip will silently flip this value from False to True when the source config doesn't set it explicitly. The default should be consistent — likely True to match the provider class.

Suggested change

f"produced by build_conversion_tasks()."

)

return filtered

"moe_router_enable_expert_bias": bool(

getattr(provider, "moe_router_enable_expert_bias", False)

),

Or change the default in provider_bridge() to True to match the provider dataclass — pick one and be consistent.

claude · 2026-05-14T19:59:38Z

+                        ),
+                        AutoMapping(
+                            megatron_param=f"decoder.layers.{layer}.mlp.experts.linear_fc2.weight{expert}",
+                            hf_param=f"model.layers.{layer}.mlp.experts.{expert}.down_proj.weight",


Project rules prohibit bare print() — use logging.getLogger(__name__) or print_rank_0(). There's another bare print() at line 409 with the same issue.

claude · 2026-05-14T19:59:40Z

+__all__ = [
+    "Param2ModelProvider",
+]


Param2Bridge is imported (line 14) but not exported in __all__. Other model packages (DeepSeek, OlMoE, Sarvam, etc.) export both the bridge and provider.

Suggested change

__all__ = [

"Param2ModelProvider",

]

__all__ = [

"Param2Bridge",

"Param2ModelProvider",

]

claude · 2026-05-14T20:00:54Z

Review: Add Param2 model bridge

Thanks for the contribution! Found a few issues that should be addressed before merge:

Critical

Roundtrip bug -- moe_router_enable_expert_bias default mismatch: provider_bridge() defaults this to False, but megatron_to_hf_config() defaults it to True, and the provider dataclass also defaults to True. An HF-to-Megatron-to-HF roundtrip will silently flip this value when the source config does not set it. Pick one default and use it consistently.

Must fix

Missing NVIDIA copyright headers: Both param_17B_bridge.py and param_17B_provider.py are missing the required Apache 2.0 copyright header (year 2026 per project rules).
Bare print() calls: Two uses of bare print() in param_17B_bridge.py (lines 322 and 409). Project rules require logging.getLogger(name) or print_rank_0() instead.
Malformed import in models/init.py: The Param2 import block uses 2-space indent, has trailing whitespace, and is missing a trailing comma. See inline suggestion.
Param2Bridge not exported: Param2Bridge is imported in param2/init.py but missing from all. It is also not imported or listed in the top-level models/init.py. Other model families export both bridge and provider.

Observations

No unit tests are included for the new bridge/provider. At minimum, a weight-mapping roundtrip test would help catch issues like the default mismatch above.
The bridge hardcodes architecture constants (NUM_LAYERS = 21, NUM_EXPERTS = 64, etc.) as class attributes rather than reading from the provider/config. This works for a single model variant but will need refactoring if other Param2 sizes are added.

Suggested test cases

No perf tests impacted.

Add Param2 model bridge

3f774de

claude Bot reviewed May 14, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Param2 model bridge#3834

Add Param2 model bridge#3834
meghmak13 wants to merge 1 commit into
mainfrom
megh/param2

meghmak13 commented May 14, 2026

Uh oh!

copy-pr-bot Bot commented May 14, 2026

Uh oh!

claude Bot May 14, 2026

Uh oh!

claude Bot May 14, 2026

Uh oh!

claude Bot May 14, 2026

Uh oh!

claude Bot May 14, 2026

Uh oh!

claude Bot May 14, 2026

Uh oh!

claude Bot commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

meghmak13 commented May 14, 2026

What does this PR do ?

Changelog

GitHub Actions CI

Before your PR is "Ready for review"

Additional Information

Uh oh!

copy-pr-bot Bot commented May 14, 2026

Uh oh!

claude Bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot commented May 14, 2026

Review: Add Param2 model bridge

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants