Commit 50c1b05

ge0405 authored and facebook-github-bot committed

Benchmark: correct links and EBC comparison chart (pytorch#445)

Summary: Pull Request resolved: pytorch#445
Corrected links in README
Reviewed By: colin2328
Differential Revision: D37211592
fbshipit-source-id: 8293825c95e836e8b407153191ef0f185a1c61e9
1 parent 5bd37b3 commit 50c1b05

File tree

2 files changed: +3 −3 lines changed
benchmarks/README.md

Lines changed: 3 additions & 3 deletions
```diff
@@ -2,7 +2,7 @@
 We evaluate the performance of two EmbeddingBagCollection modules:
 
-1. `EmbeddingBagCollection` (EBC) ([code](https://github.com/pytorch/torchrec/blob/main/torchrec/modules/embedding_modules.py#L67)): a simple module backed by [torch.nn.EmbeddingBag](https://pytorch.org/docs/stable/generated/torch.nn.EmbeddingBag.html).
+1. `EmbeddingBagCollection` (EBC) ([code](https://pytorch.org/torchrec/torchrec.modules.html#torchrec.modules.embedding_modules.EmbeddingBagCollection)): a simple module backed by [torch.nn.EmbeddingBag](https://pytorch.org/docs/stable/generated/torch.nn.EmbeddingBag.html).
 
 2. `FusedEmbeddingBagCollection` (Fused EBC) ([code](https://github.com/pytorch/torchrec/blob/main/torchrec/modules/fused_embedding_bag_collection.py#L299)): a module backed by [FBGEMM](https://github.com/pytorch/FBGEMM) kernels which enables more efficient, high-performance operations on embedding tables. It is equipped with a fused optimizer, and UVM caching/management that makes much larger memory available for GPUs.
 
```
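For context on what the hunk above describes: `EmbeddingBagCollection` pools looked-up embedding rows per feature, the same sum-pooled lookup that `torch.nn.EmbeddingBag` performs. The following is a minimal, hypothetical pure-Python sketch of that operation (no torch dependency; the function name and toy table are illustrative, not TorchRec API):

```python
def embedding_bag_sum(table, indices, offsets):
    """Sum-pool embedding rows per bag, mimicking torch.nn.EmbeddingBag:
    `table` is a list of embedding rows, `indices` a flat list of lookup ids,
    and `offsets` marks where each bag starts inside `indices`."""
    dim = len(table[0])
    bounds = list(offsets) + [len(indices)]
    bags = []
    for start, end in zip(bounds, bounds[1:]):
        pooled = [0.0] * dim
        for idx in indices[start:end]:
            for d in range(dim):
                pooled[d] += table[idx][d]
        bags.append(pooled)
    return bags

# Toy 3-row table with embedding_dim=2; two bags: ids [0, 2] and [1].
table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(embedding_bag_sum(table, [0, 2, 1], [0, 2]))  # [[2.0, 1.0], [0.0, 1.0]]
```

Fused EBC performs the same logical lookup, but through FBGEMM kernels with the optimizer update fused into the backward pass.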
```diff
@@ -17,7 +17,7 @@ embedding_dim_size = 128
 
 Other setup includes:
 - Optimizer: Stochastic Gradient Descent (SGD)
-- Dataset: Random dataset ([code](https://github.com/pytorch/torchrec/blob/main/torchrec/datasets/random.py))
+- Dataset: Random dataset ([code](https://pytorch.org/torchrec/torchrec.datasets.html#module-torchrec.datasets.random))
 - CUDA 11.7, NCCL 2.11.4.
 - AWS EC2 instance with 8 16GB NVIDIA Tesla V100
 
```
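The setup listed in the hunk above feeds a timed training loop whose results are reported as "mean (+/- deviation) seconds". As a hedged illustration only (this is not TorchRec's actual benchmark harness), a minimal timing helper in that spirit:

```python
import statistics
import time

def benchmark(fn, warmup=2, iters=5):
    """Run `fn` a few times untimed to warm up, then time `iters` runs and
    return (mean, stdev) in seconds, matching the table's reporting format."""
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return statistics.mean(times), statistics.stdev(times)

mean_s, stdev_s = benchmark(lambda: sum(range(100_000)))
print(f"{mean_s:.4f} (+/- {stdev_s:.4f}) second")
```

On GPU workloads the real harness must also synchronize the device before reading the clock, or the measured time excludes queued kernels.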

```diff
@@ -62,7 +62,7 @@ Here, we demonstrate the advantage of UVM/UVM-caching with Fused EBC. With UVM c
 |Fused EBC with UVM | 0.62 (+/- 5.34) second | full sized DLRM EMB |
 
 The above performance comparison is also put in a bar chart for better visualization.
-![EBC_benchmarks_dlrm_emb](EBC_benchmarks_dlrm_emb.png)
+![EBC_benchmarks_dlrm_emb](https://github.com/pytorch/torchrec/tree/main/benchmarks/EBC_benchmarks_dlrm_emb.png)
 
 
 ### 3. Comparison between EBC and fused_EBC on different sized embedding tables (`ebc_comparison_scaling`)
```
