Skip to content

[inference] fix: Support MCore dev inference mode#3876

Merged
yaoyu-33 merged 2 commits into
mainfrom
yuya/mcore-dev-autofix-20260518-pr3871
May 20, 2026
Merged

[inference] fix: Support MCore dev inference mode#3876
yaoyu-33 merged 2 commits into
mainfrom
yuya/mcore-dev-autofix-20260518-pr3871

Conversation

@yaoyu-33
Copy link
Copy Markdown
Contributor

Summary

Fixes the current MCore dev bump signal in #3871.

Target: dev
Failure classification: Bridge-side compatibility issue across MCore variants.

Root cause:

  • main/mcore-dev no longer exports InferenceMode from megatron.core.inference.utils, while Bridge VLM inference imported it directly.
  • The unit/import check also flagged missing cache_position entries in the Qwen3ASR thinker forward docstrings.

Fix:

  • Add a narrow Bridge VLM compatibility shim that uses MCore InferenceMode when present and falls back to no-op methods when the dev commit does not expose it.
  • Route VLM engine code and its unit test through that shim.
  • Document cache_position in the two Qwen3ASR forward docstrings flagged by the import check.

Guards:

  • Added one MCore compatibility guard in src/megatron/bridge/inference/vlm/_mcore_compat.py.
  • Removal TODO: remove when Megatron-Core dev exposes InferenceMode from megatron.core.inference.utils.

Validation

2026-05-18 PDT:

  • CW interactive job 11877695: uv run python -m pytest tests/unit_tests/inference/vlm/test_vlm_engine.py tests/unit_tests/inference/vlm/test_base.py -q -> 12 passed.
  • CW interactive job 11878377: uv run pre-commit run --all-files -> passed.
  • Local: uvx pre-commit run --files src/megatron/bridge/inference/vlm/_mcore_compat.py src/megatron/bridge/inference/vlm/vlm_engine.py tests/unit_tests/inference/vlm/test_vlm_engine.py src/megatron/bridge/models/qwen3_asr/hf_qwen3_asr/modeling_qwen3_asr.py -> passed.
  • Local: uvx ruff check ..., uvx ruff format --check ..., and git diff --check -> passed.

Local uv run python -m pytest ... in this workstation checkout was blocked before pytest by causal-conv1d==1.6.2.post1 build isolation requiring torch during wheel build, so the authoritative pytest/pre-commit runs above were done on CW interactive with the CI container.

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 18, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@yaoyu-33
Copy link
Copy Markdown
Contributor Author

/ok to test c2a1858

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 18, 2026

Light Code Review

LGTM — clean, narrowly scoped fix for the MCore dev InferenceMode import breakage.

A couple of minor observations (non-blocking):

  1. _mcore_compat.py error guard: The InferenceMode not in str(exc) check on L20 correctly re-raises unrelated ImportErrors. However, if megatron.core.inference.utils itself fails to import due to a transitive dependency issue whose error message happens to contain the string InferenceMode, the exception would be silently swallowed. This is an unlikely edge case and the current approach is pragmatic, but worth noting for the TODO removal later.

  2. Docstring additions in modeling_qwen3_asr.py: The cache_position docstrings at L996-998 and L1179-1180 are correct and match the HF convention.

Suggested test cases

No perf tests impacted.

@yaoyu-33 yaoyu-33 added area:model Model implementations and HF bridge logic bug Something isn't working full-test-suite needs-review PR is ready for code review and waiting on a reviewer labels May 18, 2026
@yaoyu-33
Copy link
Copy Markdown
Contributor Author

/ok to test 4daaa99

@yaoyu-33
Copy link
Copy Markdown
Contributor Author

/ok to test 22d897e

yaoyu-33 added 2 commits May 19, 2026 13:10
Signed-off-by: Yu Yao <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
@yaoyu-33 yaoyu-33 force-pushed the yuya/mcore-dev-autofix-20260518-pr3871 branch from 22d897e to 34ab3ea Compare May 19, 2026 20:12
@yaoyu-33
Copy link
Copy Markdown
Contributor Author

/ok to test 34ab3ea

@yaoyu-33
Copy link
Copy Markdown
Contributor Author

CI triage: unrelated/main, not PR-caused.

Evidence:

No patch from this PR is needed for the failure. The branch likely needs to be updated onto current main to pick up #3894 before rerunning CI. I did log-only triage and did not run local unit tests.

@yaoyu-33 yaoyu-33 merged commit 0a21da4 into main May 20, 2026
153 of 156 checks passed
@yaoyu-33 yaoyu-33 deleted the yuya/mcore-dev-autofix-20260518-pr3871 branch May 20, 2026 02:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:model Model implementations and HF bridge logic bug Something isn't working full-test-suite needs-review PR is ready for code review and waiting on a reviewer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant