[simple_fsdp] apply bucketing ag/rs passes, reordering collectives, sink #1464

IvanKobzarev wants to merge 1 commit into main from
Conversation
a4b99bf to 56196de
```python
logger.info("Applied Data Parallel (dp mode=%s) to the model", dp_mode)

if job_config.training.compile:
    from torch._inductor.comms import (
```
nit: you don't need to import. you can pass string names of the passes instead, e.g. passes = ["sink_waits_iterative", ...]
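A minimal sketch of this suggestion, assuming (as the nit states) that the inductor config accepts pass names as strings; the pass names mirror the ones used elsewhere in this diff:

```python
# Sketch of the reviewer's nit: refer to the passes by string name so the
# explicit import from torch._inductor.comms becomes unnecessary.
# Assumes inductor resolves these string names, per the comment above.
import torch._inductor.config as inductor_config

inductor_config.reorder_for_compute_comm_overlap_passes = [
    "sink_waits_iterative",
    "reorder_communication_preserving_peak_memory",
]
```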
```python
        sink_waits_iterative,
    )

    torch._inductor.config.allow_buffer_reuse = False
```
i think since this will change the behavior of compile for non-simplefsdp cases, we need to understand the impact before we can land. Or, we can introduce a new option for compile_and_optimize_comms that bundles all these things together.
In my autoparallel branch I preferred to expose the manual controls directly, mostly to allow experimentation. But for general consumption i think it is nicer to enable the right 'recipe' with a simple switch, so i like your approach.
> i think since this will change the behavior of compile for non-simplefsdp cases
@wconstab This is in the simple_fsdp folder, for the SimpleFSDP case alone. Why do you think it would change the behavior of non-simplefsdp cases?
oops, i was just not paying attention to which file this was, i assumed it was the core parallelize file. disregard my comment.
```python
    torch._inductor.config.bucket_all_gathers_fx = "fsdp"
    torch._inductor.config.bucket_reduce_scatters_fx = "fsdp"
    torch._inductor.config.reorder_for_compute_comm_overlap = True
```
Thank you for adding this. It would be nice if you could also update the README and detail what these configs are doing!
tianyu-l left a comment
Please see inline comments. I think it'd be good if we can keep the passes configurable.
```python
    torch._inductor.config.allow_buffer_reuse = False
    torch._inductor.config.reorder_for_peak_memory = False
    torch._inductor.config.reorder_for_compute_comm_overlap = False

    torch._inductor.config.bucket_all_gathers_fx = "fsdp"
    torch._inductor.config.bucket_reduce_scatters_fx = "fsdp"
    torch._inductor.config.reorder_for_compute_comm_overlap = True
    torch._inductor.config.reorder_for_compute_comm_overlap_passes = [
        sink_waits_iterative,
        reorder_communication_preserving_peak_memory,
    ]
```
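Since a reviewer asked for documentation of these configs, here is an annotated restatement of the block above. The descriptions are my reading of these inductor knobs and should be treated as a sketch, not as authoritative documentation:

```python
# Annotated copy of the settings in this diff (descriptions are best-effort):

# Disable buffer reuse: reused buffers add dependencies between otherwise
# independent ops, which can block the reordering passes below.
torch._inductor.config.allow_buffer_reuse = False

# Turn off inductor's default peak-memory reordering so it does not
# conflict with the communication-reordering passes enabled here.
torch._inductor.config.reorder_for_peak_memory = False

# Bucket many small FSDP all-gathers / reduce-scatters into fewer,
# larger collectives ("fsdp" selects the FSDP-style bucketing strategy).
torch._inductor.config.bucket_all_gathers_fx = "fsdp"
torch._inductor.config.bucket_reduce_scatters_fx = "fsdp"

# Run scheduling passes that overlap communication with compute:
torch._inductor.config.reorder_for_compute_comm_overlap = True
torch._inductor.config.reorder_for_compute_comm_overlap_passes = [
    sink_waits_iterative,  # sink wait ops as late as possible
    reorder_communication_preserving_peak_memory,  # hoist collectives without raising peak memory
]
```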
I think we should make this configurable. Although I agree they can be turned on by default, I think it's still valuable if users can turn these passes off:
- Some researchers may appreciate a plain graph without reordering. This was one of the original motivations for adding this experiment here.
- The passes added here may not always be stable. If they cause failures, we should have a fallback to not enable them.
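One possible shape for the configurability requested here (a sketch only; `enable_comm_optimizations` is a hypothetical option name, not something that exists in the config manager):

```python
# Hypothetical sketch: bundle the whole recipe behind one switch so users
# can fall back to a plain, unreordered graph.
# "enable_comm_optimizations" is an invented option name, and the string
# pass names assume inductor accepts them (per the nit earlier in this PR).
if job_config.training.compile and getattr(
    job_config.training, "enable_comm_optimizations", True
):
    torch._inductor.config.allow_buffer_reuse = False
    torch._inductor.config.reorder_for_peak_memory = False
    torch._inductor.config.bucket_all_gathers_fx = "fsdp"
    torch._inductor.config.bucket_reduce_scatters_fx = "fsdp"
    torch._inductor.config.reorder_for_compute_comm_overlap = True
    torch._inductor.config.reorder_for_compute_comm_overlap_passes = [
        "sink_waits_iterative",
        "reorder_communication_preserving_peak_memory",
    ]
```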
Will has already added options in the config manager, so I will close this PR.
Could you elaborate on this a bit? I haven't seen @wconstab's change in the
Also for future reference, for such changes please consider presenting a performance report in the PR summary.
Sorry, that was not to the
So for simplefsdp, porting this to
Depends on landing PyTorch PR pytorch/pytorch#158663