EDM is a novel network fabric design that achieves ultra-low latency memory disaggregation over Ethernet in datacenter environments. This project was published at ASPLOS 2025.
@inproceedings{su2025edm,
title={EDM: An Ultra-Low Latency Ethernet Fabric for Memory Disaggregation},
author={Su, Weigao and Shrivastav, Vishal},
booktitle={Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1},
year={2025}
}
Modern datacenters are moving towards disaggregated architectures where memory resources are separated from compute nodes. However, accessing remote memory over traditional Ethernet networks incurs significant latency overhead. EDM addresses this challenge through two key innovations:
- PHY-Layer Network Stack: EDM implements the entire network stack for remote memory access within the Physical layer (PHY) of Ethernet, bypassing the traditional transport overhead.
- Centralized Flow Scheduler: A fast, in-network memory flow scheduler operates in the switch’s PHY layer, creating dynamic virtual circuits between compute and memory nodes to eliminate queuing delays.
Performance
- ~300ns end-to-end latency in unloaded networks.
- Maintains latency within 1.3x of unloaded performance even under high network loads.
This repository contains three main components:
- FPGA Verification
- Verilog implementation for Xilinx Alveo U200 (EDM-PHY).
- Hardware Simulation
- YCSB workload and hardware simulator (hwsimu).
- Network Simulation
- Trace file generator based on real-world workloads (tracegen).
- 144-node single-rack network simulator (simulator).
OS: Ubuntu 22.04
SW: Vivado 2023.2
HW: Xilinx Alveo U200 FPGA board
Dependencies: Python >= 3.9, scikit-learn, Matplotlib, pandas
Install the Python dependencies, then clone the repository and its submodules:
pip3 install pandas matplotlib scikit-learn
git clone https://github.com/wegul/EDM.git
cd EDM
git submodule update --init
Please follow instructions.
This section reproduces Figures 4 and 5 for artifact evaluation.
- Compile
cd EDM/hwsimu
mkdir -p build
cd build
cmake ..
make
- To generate traces, do
cd ../scripts
./gen_trace.sh
- To run both experiments and produce the results, do
./run_all.sh
- The final results and figures are written to hwsimu/result. For convenience, the results reported in our paper are in hwsimu/golden.result. Because the traces are randomly generated, results may vary by less than 10%.
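To check a rerun against the reference numbers, one option is to compare each metric with its golden value and flag anything that deviates by 10% or more. The sketch below assumes the results have already been loaded into plain dictionaries. The metric names and values are hypothetical, and the helper functions are illustrative rather than part of the repository.

```python
def relative_deviation(measured: float, golden: float) -> float:
    """Return |measured - golden| / |golden|."""
    if golden == 0:
        raise ValueError("golden value must be non-zero")
    return abs(measured - golden) / abs(golden)


def out_of_tolerance(measured: dict, golden: dict, tol: float = 0.10) -> dict:
    """Map each metric that is missing or deviates by >= tol to its deviation."""
    flagged = {}
    for name, ref in golden.items():
        if name not in measured:
            flagged[name] = None  # metric missing from the rerun
            continue
        dev = relative_deviation(measured[name], ref)
        if dev >= tol:
            flagged[name] = dev
    return flagged


# Hypothetical values, for illustration only.
golden = {"workload_a": 100.0, "workload_b": 200.0}
rerun = {"workload_a": 104.0, "workload_b": 230.0}
flagged = out_of_tolerance(rerun, golden)
print(flagged)
```

A missing metric is reported separately from an out-of-range one, which helps distinguish a failed run from ordinary trace-to-trace variation.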
We use the YCSB dataset as key-value store memory traces to demonstrate EDM's bandwidth efficiency and end-to-end latency. For convenience, pre-generated YCSB workloads are provided in EDM_simu/ycsb_raw_output.
In the bandwidth experiment, we empirically measure the overhead of the inter-packet gap and header encapsulation in EDM and RDMA, and use it to infer the theoretical bandwidth utilization on real-world traces.
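The idea behind the bandwidth calculation can be sketched as the fraction of on-wire bytes that carry payload. The example below uses standard Ethernet framing sizes and assumes RoCEv2 over IPv4 as the RDMA encapsulation. The PHY-level overhead value is a hypothetical placeholder, not EDM's measured overhead, and minimum-frame padding is ignored.

```python
def wire_efficiency(payload_bytes: int, overhead_bytes: int) -> float:
    """Fraction of on-wire bytes that carry payload."""
    if payload_bytes <= 0 or overhead_bytes < 0:
        raise ValueError("payload must be positive and overhead non-negative")
    return payload_bytes / (payload_bytes + overhead_bytes)


# Standard Ethernet per-frame costs, in bytes.
PREAMBLE_SFD = 8
INTER_PACKET_GAP = 12  # minimum gap
MAC_HEADER = 14
FCS = 4

# Assumed RDMA encapsulation (RoCEv2): IPv4 + UDP + BTH + ICRC.
ROCE_V2_HEADERS = 20 + 8 + 12 + 4

rdma_overhead = PREAMBLE_SFD + INTER_PACKET_GAP + MAC_HEADER + FCS + ROCE_V2_HEADERS

# Hypothetical placeholder for a PHY-level scheme; NOT EDM's measured value.
HYPOTHETICAL_PHY_OVERHEAD = 8

for size in (64, 256, 1024):
    rdma = wire_efficiency(size, rdma_overhead)
    phy = wire_efficiency(size, HYPOTHETICAL_PHY_OVERHEAD)
    print(f"{size:5d} B payload: RDMA {rdma:.3f}, PHY-level (hypothetical) {phy:.3f}")
```

Fixed per-message overhead dominates for small payloads, which is why cache-line-sized memory accesses are the interesting case.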
The latency experiment is based on the latency profile of the EDM hardware testbed and of a local DDR3 module on the FPGA, which has an average access latency of ~82ns. We randomly assign objects in the YCSB traces to local or remote memory according to their addresses (keys). Because object accesses in YCSB follow a Zipfian distribution, which affects the final end-to-end latency, each raw trace is shuffled 10 times, and the plots include error bars.
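One way to read the procedure above is sketched below: draw a Zipf-like access trace, re-draw the local/remote split of keys for each of 10 trials, and report the mean and standard deviation as the error bar. This is an illustrative sketch, not the repository's script. The remote latency, skew, key count, and local fraction are assumptions chosen for the example.

```python
import random
import statistics

LOCAL_NS = 82.0    # average local DDR3 latency quoted above
REMOTE_NS = 300.0  # assumed remote latency; placeholder, not the testbed profile


def zipf_trace(num_keys, length, rng, skew=0.99):
    """Draw `length` key accesses where rank r has weight 1 / r**skew."""
    weights = [1.0 / (rank ** skew) for rank in range(1, num_keys + 1)]
    return rng.choices(range(num_keys), weights=weights, k=length)


def mean_latency(trace, local_keys):
    """Average access latency when keys in `local_keys` are served locally."""
    total = sum(LOCAL_NS if key in local_keys else REMOTE_NS for key in trace)
    return total / len(trace)


def run_trials(trace, num_keys, local_fraction=0.5, trials=10, seed=0):
    """Re-draw the local/remote split `trials` times; return (mean, stdev) in ns."""
    rng = random.Random(seed)
    keys = list(range(num_keys))
    results = []
    for _ in range(trials):
        rng.shuffle(keys)
        local_keys = set(keys[: int(num_keys * local_fraction)])
        results.append(mean_latency(trace, local_keys))
    return statistics.mean(results), statistics.stdev(results)


rng = random.Random(42)
trace = zipf_trace(num_keys=1000, length=20000, rng=rng)
avg_ns, err_ns = run_trials(trace, num_keys=1000)
print(f"mean latency: {avg_ns:.1f} ns +/- {err_ns:.1f} ns")
```

Because a few hot keys account for most accesses, whether they land in local or remote memory moves the average noticeably, so repeating the split and plotting the spread gives a fairer picture than a single run.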
This section reproduces Figure 6 for artifact evaluation.
- Build:
First, build the trace generator inside the EDM/EDM-tracegen/ directory:
cd EDM/EDM-tracegen
autoconf
autoreconf -i
./configure
make
'autoconf' might throw an error. If it does, ignore the error and run the command again.
Next, generate traces. This will create EDM/netsimu/testdir
cd ./EDM-workload
./generate_traces.sh
- Compile and run: This will create EDM/netsimu/results/
cd ../../netsimu
./compile.sh
./run_tests.sh
Note:
i) Run the next step only after all experiments have finished. You can check with screen -ls.
ii) Our script runs all experiments in parallel and spawns more than 120 screen sessions. Your system might kill some of them, which results in missing output files. If that happens, please contact us and we will provide access to our server.
- Collect results and plot:
./get_result.sh
The generated graphs are in EDM/netsimu/results/. Note that for mixed_* traces, the graph includes only three groups, because the other two (pure RREQ and pure WREQ) are in rreq_result.csv and wreq_result.csv, respectively.
For convenience, the averaged results from our submission are in EDM/netsimu/results/golden.result.
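The note above asks that all experiments finish before results are collected. As a convenience, the check can be scripted by parsing the text printed by screen -ls. The sketch below is an assumption about how one might do this, not part of the repository, and the session names in the sample are hypothetical.

```python
import re
import subprocess

# A session line is indented and looks like "<pid>.<name>  (Detached)".
SESSION_LINE = re.compile(r"^[ \t]+(\d+)\.(\S+)", re.MULTILINE)


def parse_sessions(screen_ls_output: str) -> list:
    """Return the session names found in the text printed by `screen -ls`."""
    return [name for _pid, name in SESSION_LINE.findall(screen_ls_output)]


def running_sessions() -> list:
    """Run `screen -ls` and return the names of sessions that still exist."""
    # screen may exit non-zero when no sessions exist, so the return code is ignored.
    proc = subprocess.run(["screen", "-ls"], capture_output=True, text=True)
    return parse_sessions(proc.stdout)


# Sample output with hypothetical session names, for illustration only.
sample = (
    "There are screens on:\n"
    "\t4021.exp_a\t(Detached)\n"
    "\t4033.exp_b\t(Detached)\n"
    "2 Sockets in /run/screen/S-user.\n"
)
remaining = parse_sessions(sample)
print(f"{len(remaining)} session(s) still listed: {remaining}")
```

Calling running_sessions() on the machine that ran the tests returns an empty list once every session has exited. Unrelated screen sessions owned by the same user are listed too, so filter by name if needed.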