[inference] fix: Support MCore dev inference mode #3876
/ok to test c2a1858
Light Code Review
LGTM: clean, narrowly scoped fix for the MCore dev InferenceMode import breakage. A couple of minor observations (non-blocking):
Suggested test cases
No perf tests impacted.
/ok to test 4daaa99
481b647 to 22d897e
/ok to test 22d897e
Signed-off-by: Yu Yao <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
22d897e to 34ab3ea
/ok to test 34ab3ea
CI triage: unrelated/main, not PR-caused. Evidence:
No patch from this PR is needed for the failure. The branch likely needs to be updated onto current |
Summary
Fixes the current MCore dev bump signal in #3871.
Target: `dev`
Failure classification: Bridge-side compatibility issue across MCore variants.
Root cause:
- `main/mcore-dev` no longer exports `InferenceMode` from `megatron.core.inference.utils`, while Bridge VLM inference imported it directly.
- `cache_position` entries in the Qwen3ASR thinker forward docstrings.
Fix:
- `InferenceMode` when present and falls back to no-op methods when the dev commit does not expose it.
- `cache_position` in the two Qwen3ASR forward docstrings flagged by the import check.
Guards:
- `src/megatron/bridge/inference/vlm/_mcore_compat.py`.
- `InferenceMode` from `megatron.core.inference.utils`.
Validation
2026-05-18 PDT:
- 11877695: `uv run python -m pytest tests/unit_tests/inference/vlm/test_vlm_engine.py tests/unit_tests/inference/vlm/test_base.py -q` -> `12 passed`.
- 11878377: `uv run pre-commit run --all-files` -> passed.
- `uvx pre-commit run --files src/megatron/bridge/inference/vlm/_mcore_compat.py src/megatron/bridge/inference/vlm/vlm_engine.py tests/unit_tests/inference/vlm/test_vlm_engine.py src/megatron/bridge/models/qwen3_asr/hf_qwen3_asr/modeling_qwen3_asr.py` -> passed.
- `uvx ruff check ...`, `uvx ruff format --check ...`, and `git diff --check` -> passed.
Local `uv run python -m pytest ...` in this workstation checkout was blocked before pytest by `causal-conv1d==1.6.2.post1` build isolation requiring `torch` during wheel build, so the authoritative pytest/pre-commit runs above were done on CW interactive with the CI container.