Pull requests: pytorch/torchtitan
add option to use synthetic input data
CLA Signed
This label is managed by the Meta Open Source bot.
#1632
opened Aug 25, 2025 by
alfuyao1986
Distributed Scion/Muon
CLA Signed
#1630
opened Aug 25, 2025 by
rakkit
workarounds for all2all autograd issues that Ruisi ran into
CLA Signed
#1604
opened Aug 20, 2025 by
bdhirsh
[WIP] Activation Offloading with Separate Stream
CLA Signed
#1591
opened Aug 18, 2025 by
excelle08
Update SAC config to force save instead of recompute
CLA Signed
Muon with 3D tensors
CLA Signed
#1584
opened Aug 16, 2025 by
byronxu99
[EP] add initial support for NVSHMEM-based all-to-all
CLA Signed
#1569
opened Aug 14, 2025 by
tianyu-l
[Do Not Land] Debug for SDPA + CP nan issue in DeepSeekV3
CLA Signed
fix: remove redundant legacy usage of mp in checkpoint
CLA Signed
#1562
opened Aug 13, 2025 by
yzs981130
[PoC] Enable flexible different layout for same mesh via a util function
CLA Signed
#1550
opened Aug 11, 2025 by
fduwjj
Adding logic for cleaning up FT checkpoints
CLA Signed
#1528
opened Aug 5, 2025 by
bentherien
[WIP][Dion Official Optimizer, Muon] Integrate official Dion, and high speed Muon, optimizer impl with TorchTitan and Optimizer component class
CLA Signed
Fix semi-sync training with 1GPU per FT replica
CLA Signed
#1505
opened Jul 31, 2025 by
bentherien
perf testing
CLA Signed
#1488
opened Jul 29, 2025 by
ankitageorge
Draft
[Evaluation] Adding evaluation feature to TorchTitan
CLA Signed
#1470
opened Jul 28, 2025 by
raymin0223
[autoparallel] Enable bucketing passes for autoparallel, reorder and sink_waits.
CLA Signed
#1463
opened Jul 25, 2025 by
IvanKobzarev
Autoparallel support for DP-only, DP+TP, or TP-only
CLA Signed
#1459
opened Jul 25, 2025 by
IvanKobzarev
[WIP] Integrate autoparallel into torchtitan
CLA Signed
#1458
opened Jul 25, 2025 by
IvanKobzarev
add lr logging
CLA Signed
#1453
opened Jul 24, 2025 by
samsja
[torchtitan] TorchFunctionMode + SAC issue
CLA Signed
#1434
opened Jul 21, 2025 by
XilunWu
[torchtitan] CP + SDPA issue reproduce
CLA Signed
#1432
opened Jul 21, 2025 by
XilunWu
[WIP][Kernels] Add Quack (Cutlass 4.0) RMSNorm
CLA Signed
add float8 support
CLA Signed
#1378
opened Jul 10, 2025 by
bdhirsh