This PR adds the aten-level autobucketing pass to SimpleFSDP. It runs
autobucketing with the aot_eager backend, without inductor. The aten FX
autobucketing pass can be found in this PR:
pytorch/pytorch#163960.
Key updates are:
1. Support a customized `aot_eager_autobucketing` backend to perform
the autobucketing optimization.
2. In SimpleFSDP, the model backend can be replaced by the user's customized
passes using `compile.model_backend_override`.
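
The override in item 2 can be pictured as a simple name-resolution step. The sketch below is illustrative only and is not the code added by this PR: `CompileConfig`, its `backend` field, and the helper functions are hypothetical stand-ins, and only the backend names and the `model_backend_override` field name come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Backend names as listed in the README; the grouping mirrors its two categories.
DEFAULT_BACKENDS = {"inductor", "aot_eager", "eager"}      # no optimization
AUTO_OPTIMIZATION_BACKENDS = {"aot_eager_autobucketing"}   # auto bucketing & reordering


@dataclass
class CompileConfig:
    """Hypothetical stand-in for the `compile` config section."""
    backend: str = "inductor"                      # assumed default, for illustration
    model_backend_override: Optional[str] = None   # field name taken from the PR text


def resolve_backend_name(cfg: CompileConfig) -> str:
    """Return the override when it is set, otherwise the regular backend name."""
    name = cfg.model_backend_override or cfg.backend
    if name not in DEFAULT_BACKENDS | AUTO_OPTIMIZATION_BACKENDS:
        raise ValueError(f"unsupported backend: {name!r}")
    return name


def uses_auto_optimization(name: str) -> bool:
    """True when the resolved backend applies the autobucketing pass."""
    return name in AUTO_OPTIMIZATION_BACKENDS


if __name__ == "__main__":
    cfg = CompileConfig(model_backend_override="aot_eager_autobucketing")
    name = resolve_backend_name(cfg)
    print(name, uses_auto_optimization(name))
```

Keeping the override separate from the regular backend field lets the default path stay untouched when no override is given; the validation step is an assumption of this sketch and may not match how the project handles unknown names.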
This folder includes an experimental frontend implementation for [SimpleFSDP: Simpler Fully Sharded Data Parallel with torch.compile](https://arxiv.org/abs/2411.00284). SimpleFSDP is a compiler-based Fully Sharded Data Parallel (FSDP) framework, which has a simple implementation for maintenance and composability, allows full computation-communication graph tracing, and brings performance enhancement via compiler backend optimizations.

-### Run SimpleFSDP Training on Llama 3
+### Run SimpleFSDP Training on Llama3 & DeepSeek_v3

#### Training Llama3 models

@@ -42,6 +42,23 @@ Some of the features require the updates from PyTorch, with which we are working
SimpleFSDP relies on the compiler backend to perform optimizations (i.e., bucketing & reordering) for good training performance. Currently, the following optimization passes are supported:
+
+1. no optimization: default torch.compile backends (e.g., "inductor", "aot_eager", "eager")
+
+2. auto optimization: perform auto-bucketing & reordering without user inputs. **Note: it is not guaranteed that users will get the most optimized training performance**
+  - "aot_eager_autobucketing": perform autobucketing at aten fx-level, and perform code execution with aot_eager backend.
+
+
+users can specify the pass (e.g., "aot_eager_autobucketing") via additional configs: