
feat: add support for DeepEP in Qwen3 parallelization logic#2392

Open
jordisassoon wants to merge 2 commits into pytorch:main from jordisassoon:main

Conversation

@jordisassoon jordisassoon commented Feb 18, 2026

This PR allows passing expert_parallel_comm_backend = "deepep" to launch experiments with DeepEP on Qwen3 model architectures. The default is expert_parallel_comm_backend = "standard".

The changes duplicate logic from the deepseek_v3 args/model/infra implementation.

I ran some benchmarks to verify the speed-up of expert_parallel_comm_backend = "deepep" compared to "standard", averaging TPS after warmup:

[Image: benchmark results comparing average TPS for "deepep" vs. "standard"]
Benchmark info

8xB200 GPUs

[parallelism]
data_parallel_replicate_degree = 1
data_parallel_shard_degree = 8
tensor_parallel_degree = 1
pipeline_parallel_degree = 1
context_parallel_degree = 1
expert_parallel_degree = 2
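
Based on the PR description, enabling the new backend on top of the benchmark configuration above might look like the following sketch (the placement of expert_parallel_comm_backend under [parallelism] is an assumption, not confirmed by the PR):

```toml
[parallelism]
data_parallel_replicate_degree = 1
data_parallel_shard_degree = 8
tensor_parallel_degree = 1
pipeline_parallel_degree = 1
context_parallel_degree = 1
expert_parallel_degree = 2
# Option added by this PR; defaults to "standard" when omitted.
expert_parallel_comm_backend = "deepep"
```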

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Feb 18, 2026
@jordisassoon jordisassoon marked this pull request as draft February 18, 2026 13:00
@jordisassoon jordisassoon marked this pull request as ready for review February 18, 2026 13:27
Contributor
@shuhuayu shuhuayu left a comment
Thanks for the contribution. LGTM!
