## TorchTitan & TorchComms Composability Testing

### Overview

This folder provides a framework for composability testing with TorchComms and distributed training in TorchTitan. It enables flexible experimentation with distributed communication primitives and various parallelism strategies in PyTorch.

> **TODO:** Additional documentation will be provided once TorchComms is publicly released.

### Quick Start

The following command uses Llama 3 as an example, but it should work with other models as well:

```bash
TEST_BACKEND=nccl TRAIN_FILE=torchtitan.experiments.torchcomms.train CONFIG_FILE="./torchtitan/models/llama3/train_configs/debug_model.toml" ./run_train.sh
```

### Features

#### Distributed Training Utilities
- Custom communicator backend initialization via `torchcomms.new_comm`
- Composition of TorchComms with `DeviceMesh` via the `torchcomms.init_device_mesh` wrapper API (see the sketch below)
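
The sketch below illustrates how these two entry points might be wired together. Only the function names `torchcomms.new_comm` and `torchcomms.init_device_mesh` come from this folder; the argument names, the backend string, and the mesh shape are assumptions and should be checked against the TorchComms API and the training script in this directory.

```python
# Illustrative sketch only -- the function names come from this README, but the
# argument names and values below are assumptions; check the TorchComms API.
import torch
import torchcomms

# Pick the local CUDA device for this rank (device selection is an assumption).
device = torch.device("cuda", torch.cuda.current_device())

# Create a communicator for the backend selected via TEST_BACKEND (e.g. "nccl").
comm = torchcomms.new_comm("nccl", device=device)  # assumed signature

# Build a DeviceMesh through the TorchComms wrapper so parallelism APIs
# (fully_shard, TP, PP, CP) can consume it like a regular DeviceMesh.
# How `comm` is attached to the mesh is implementation-specific and omitted here.
mesh = torchcomms.init_device_mesh(  # assumed to mirror torch.distributed's init_device_mesh
    "cuda",
    (2, 4),                          # e.g. 2-way data parallel x 4-way tensor parallel
    mesh_dim_names=("dp", "tp"),
)
```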

#### Parallelism Support
Locally tested with the following parallelism strategies (see the `fully_shard` sketch after this list):
- **FSDP** (`fully_shard`) - Fully Sharded Data Parallel
- **TP** - Tensor Parallelism
- **PP** - Pipeline Parallelism
- **CP** - Context Parallelism
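
As a composability example, a model can be sharded with FSDP2's `fully_shard` over the data-parallel dimension of a TorchComms-backed mesh. This is a minimal sketch: `mesh` is assumed to come from `torchcomms.init_device_mesh` as in the sketch above, `ToyModel` is a placeholder, and the `fully_shard` import path may differ across PyTorch versions.

```python
# Minimal sketch: shard a toy model over the "dp" dimension of a
# TorchComms-backed DeviceMesh (`mesh` is assumed from the sketch above).
import torch
import torch.nn as nn
from torch.distributed.fsdp import fully_shard  # import path may vary by PyTorch version

class ToyModel(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.in_proj = nn.Linear(1024, 4096)
        self.out_proj = nn.Linear(4096, 1024)

    def forward(self, x):
        return self.out_proj(torch.relu(self.in_proj(x)))

model = ToyModel().cuda()

# Shard parameters across the data-parallel ranks; other mesh dimensions
# (e.g. "tp") remain available for the other parallelism strategies.
fully_shard(model, mesh=mesh["dp"])
```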

### Roadmap

- [ ] N-D parallelism end-to-end performance and convergence tests
- [ ] Integration and testing with Expert Parallelism
- [ ] Integration and testing with `torch.compile`