
Pull requests: pytorch/torchtitan


Pull requests list

add train file specification option to run_start.sh (labels: CLA Signed, fb-exported)
#1652 opened Aug 28, 2025 by Shagun-G
code refactor : making key steps modular train_step() (labels: CLA Signed, fb-exported)
#1650 opened Aug 28, 2025 by Shagun-G
[moe][compile] Turn capture_scalar_outputs off by default (labels: CLA Signed)
#1649 opened Aug 28, 2025 by xmfan
[RFC] Support full bf16 training (labels: CLA Signed)
#1646 opened Aug 27, 2025 by ebsmothers
Activation Checkpoint improvment (labels: CLA Signed)
#1645 opened Aug 27, 2025 by fegin
[BE] Move NoParallel to torchtitan.distributed (labels: CLA Signed)
#1641 opened Aug 26, 2025 by fegin
[WIP][DSV3] GroupedExperts weights conversion optimization (labels: CLA Signed)
#1639 opened Aug 25, 2025 by wwwjn
[WIP] DCP: Dequantization and expert grouping for DSv3 (labels: CLA Signed)
#1638 opened Aug 25, 2025 by saumishr Draft
[DO NOT REVIEW] debug fsdp2 checkpoint for uneven sharding (labels: CLA Signed)
#1635 opened Aug 25, 2025 by weifengpy Draft
add option to use synthetic input data (labels: CLA Signed)
#1632 opened Aug 25, 2025 by alfuyao1986
[wip] Distributed Scion/Muon (labels: CLA Signed)
#1630 opened Aug 25, 2025 by rakkit
Enable multi rank safetensor consolidation (labels: CLA Signed)
#1625 opened Aug 22, 2025 by ankitageorge
allow expert_parallel wrapper to handel kwargs (labels: CLA Signed)
#1620 opened Aug 22, 2025 by rakkit
VLM: Onboarding native resolution, native aspect ratio, interleaved VLM training (labels: CLA Signed)
#1615 opened Aug 21, 2025 by lkhphuc
1 task done
Bump version to v0.1.1 (labels: CLA Signed)
#1606 opened Aug 20, 2025 by wwwjn
workarounds for all2all autograd issues that Ruisi ran into (labels: CLA Signed)
#1604 opened Aug 20, 2025 by bdhirsh
Wrap sync + a2a in a custom op (labels: CLA Signed, high priority, module: activation checkpointing, release blocking)
#1597 opened Aug 19, 2025 by soulitzer
[WIP] Activation Offloading with Separate Stream (labels: CLA Signed)
#1591 opened Aug 18, 2025 by excelle08
Update SAC config to force save instead of recompute (labels: CLA Signed)
#1589 opened Aug 18, 2025 by soulitzer Draft
[WIP][DSV3] Remove keep a copy of GroupedExperts weight, free memory in StateDictAdapter (labels: CLA Signed)
#1585 opened Aug 16, 2025 by wwwjn
Muon with 3D tensors (labels: CLA Signed)
#1584 opened Aug 16, 2025 by byronxu99
Add config to AC to toggle early-stop and revert A2A autograd.Function workaround (labels: ci-no-td, CLA Signed)
#1580 opened Aug 15, 2025 by soulitzer
[EP] add initial support for NVSHMEM-based all-to-all (labels: CLA Signed)
#1569 opened Aug 14, 2025 by tianyu-l
[Do Not Land] Debug for SDPA + CP nan issue in DeepSeekV3 (labels: CLA Signed)
#1566 opened Aug 13, 2025 by XilunWu Draft
Multinode SkyPilot example (labels: CLA Signed)
#1564 opened Aug 13, 2025 by alex000kim