A comprehensive benchmark suite measuring the trade-off between graph resolution and learning efficiency in Electronic Design Automation (EDA), with a focus on the Placement–Clock Tree Synthesis (CTS) interface.
CTS-Bench contains 4,860 converged physical design solutions across five hardware architectures, providing paired Raw (gate-level) and Clustered graph representations with ground-truth CTS metrics.
- Total Data Points: 4,860 designs
- Architectures: PicoRV32, AES, SHA256, EthMAC (training), Zipdiv (zero-shot test)
- Placements per Architecture: 486 unique placements (except Zipdiv: 500)
- CTS Variants per Placement: 10 distinct clock trees
- Graph Representations: Raw gate-level + Physics-aware clustered (13.3× compression)
- Metrics: 15 ground-truth QoR measurements per run
Located in dataset_root/:
-
clocknet_unified_manifest.csv- Main training dataset- 3,860 samples from PicoRV32, AES, SHA256, EthMAC
- 80/20 train/validation split for in-distribution evaluation
- Includes: placement parameters, CTS knobs, 15 QoR metrics, graph paths
-
clocknet_unified_manifest_test.csv- Zero-shot generalization test set- 500 samples from Zipdiv architecture
- Structurally distinct from training designs
- Evaluates out-of-distribution generalization capability
Both manifests contain:
| Column Category | Content |
|---|---|
| Metadata | design_id, placement_seed, cts_variant |
| Placement Knobs | synth_strategy, aspect_ratio, io_mode, core_utilization, target_density, time_driven, routability_driven |
| CTS Knobs | sink_max_dia, max_wire_length, cluster_size, buffer_distance |
| QoR Metrics | skew_setup, skew_hold, slack, tns, total_power, dynamic_power, static_power, wirelength, cell_utilization, clock_buffers, inverters, routing_congestion, repair_buffers |
| Gap Metrics | pareto_gap_skew, pareto_gap_power, pareto_gap_wirelength, total_pareto_distance |
| Graph Paths | path_raw_graph, path_clustered_graph |
- Node Type: Standard cells (flip-flops + combinational logic)
- Node Count: 5,000–40,000 nodes per design
- Node Features:
- Normalized x, y coordinates
- is_ff (binary: flip-flop indicator)
- log-transformed switching activity (from SAIF)
- Edge Features: Manhattan distance between connected cells
- Format: PyTorch Geometric tensors
- Compression Ratio: ~13.3× average reduction
- Clustering Algorithm:
- Atomic cluster formation (BFS from flip-flops)
- High-spread filtering (preserve geometric diversity)
- Gravity-vector-aligned merging (electrical domain awareness)
- Node Count: 400–3,000 nodes per design
- Node Features (10 dimensions):
- Cluster centroid (x, y)
- Spatial spread (σ_x, σ_y)
- log-transformed cluster size
- Composition (N_FF, N_Logic)
- Aggregated switching activity (max, sum, toggle counts)
- Edge Features: Manhattan distance between cluster centroids
- Format: PyTorch Geometric tensors
CTS-Knob-Aware-placement/
├── dataset_root/ # Main data directory
│ ├── clocknet_unified_manifest.csv # Training manifest (3,860 rows)
│ ├── clocknet_unified_manifest_test.csv # Test manifest (500 rows, Zipdiv)
│ ├── aes_batch1-3.csv # AES design subsets
│ ├── ethmac_batch1-4.csv # EthMAC design subsets
│ ├── picorv32_batch1-2.csv # PicoRV32 design subsets
│ ├── sha256_batch1-4.csv # SHA256 design subsets
│ ├── test_zipdiv.csv # Zipdiv test data
│ ├── graphs/ # Graph files (raw + clustered)
│ │ ├── aes/
│ │ ├── ethmac/
│ │ ├── picorv32/
│ │ ├── sha256/
│ │ └── zipdiv/
│ └── experiment_log.csv # Training run logs
├── designs/ # RTL & design specifications
│ ├── aes/, ethmac/, picorv32/, sha256/, zipdiv/
│ ├── base.sdc, primitives.v, sky130_fd_sc_hd.v
├── clustering_outputs_GMM/ # Clustering analysis
│ ├── 00_AGGREGATED_COUNTS/
│ └── [per-design directories]
├── fidelity_outputs/ # Fidelity evaluation results
│ ├── 00_ALL_DESIGNS/
│ └── [per-design directories]
├── correlation_outputs/ # Correlation analysis
│ ├── 00_ALL_DESIGNS/
│ └── [per-design directories]
├── stats_outputs/ # Statistical summaries
├── runs/ # Training experiment runs
│ └── zipdiv_run_YYYYMMDD_HHMMSS/ # Timestamped results
├── scripts/ # Generation & analysis scripts
├── data_analysis/ # Visualization & metric analysis
│ ├── raw-mae-chart.py
│ ├── clustered-mae-chart.py
│ ├── clustering_fidelity_check.py
│ ├── variance.py
│ └── [other analysis scripts]
├── ml-benchmark/ # ML training code
│ └── ml.py # Multi-modal GNN training
└── requirements.txt # Python dependencies
Individual design data files are available for granular analysis:
dataset_root/
├── aes_batch1.csv, aes_batch2.csv, aes_batch3.csv
├── ethmac_batch1.csv, ethmac_batch2.csv, ethmac_batch3.csv, ethmac_batch4.csv
├── picorv32_batch1.csv, picorv32_batch2.csv
├── sha256_batch1.csv, sha256_batch2.csv, sha256_batch3.csv, sha256_batch4.csv
└── test_zipdiv.csv
Each file contains the same column structure as the main manifest but filtered for that specific architecture.
- VRAM Reduction: 17.2× (Raw → Clustered)
- Training Speedup: 3×
- Memory Usage:
- Raw GATv2: 1.2–2.5 GB/batch
- Clustered GATv2: ~70–150 MB/batch
import pandas as pd
import torch_geometric
# Load main training manifest
df_train = pd.read_csv('dataset_root/clocknet_unified_manifest.csv')
print(f"Training samples: {len(df_train)}")
print(f"Columns: {df_train.columns.tolist()}")
# Load zero-shot test set
df_test = pd.read_csv('dataset_root/clocknet_unified_manifest_test.csv')
print(f"Test samples (Zipdiv): {len(df_test)}")
# Access graph paths
raw_graph_path = df_train.iloc[0]['path_raw_graph']
clustered_graph_path = df_train.iloc[0]['path_clustered_graph']
# Load graph data (example using PyTorch Geometric)
raw_data = torch.load(f'dataset_root/{raw_graph_path}')
clustered_data = torch.load(f'dataset_root/{clustered_graph_path}')
print(f"Raw graph nodes: {raw_data.num_nodes}, edges: {raw_data.num_edges}")
print(f"Clustered graph nodes: {clustered_data.num_nodes}, edges: {clustered_data.num_edges}")# Get only AES data
df_aes = pd.read_csv('dataset_root/aes_batch1.csv')
# Get only PicoRV32 data
df_picorv32 = pd.read_csv('dataset_root/picorv32_batch1.csv')
# Combine multiple batches
df_sha256 = pd.concat([
pd.read_csv('dataset_root/sha256_batch1.csv'),
pd.read_csv('dataset_root/sha256_batch2.csv'),
pd.read_csv('dataset_root/sha256_batch3.csv'),
pd.read_csv('dataset_root/sha256_batch4.csv'),
])# Analyze skew metrics
skew_seen = df_train['skew_setup'].describe()
print(f"Skew statistics (seen architectures):\n{skew_seen}")
# Compute Pareto efficiency
df_train['is_pareto_optimal'] = (
(df_train['pareto_gap_skew'] == 1.0) &
(df_train['pareto_gap_power'] == 1.0) &
(df_train['pareto_gap_wirelength'] == 1.0)
)
print(f"Pareto optimal designs: {df_train['is_pareto_optimal'].sum()}")
# Compare raw vs clustered graph compression
df_train['compression_ratio'] = (
df_train['raw_node_count'] / df_train['clustered_node_count']
)
print(f"Average compression: {df_train['compression_ratio'].mean():.1f}×")Refer to ml-benchmark/ml.py for training multi-modal GNN models on CTS-Bench.
Training Configuration:
- Models: GCN, GraphSAGE, GATv2
- Task: Multi-target regression (Skew, Power, Wirelength)
- Epochs: 100
- Batch Size: 32
- Hardware: NVIDIA RTX 5060 (8GB VRAM)
Located in data_analysis/:
raw-mae-chart.py- Visualize MAE and R² for raw graphsclustered-mae-chart.py- Visualize MAE and R² for clustered graphsclustering_fidelity_check.py- Evaluate clustering qualityvariance.py- Variance analysis across designsvisualize-correlations.py- Metric correlation heatmaps
Last Updated: February 2026
To regenerate the complete CTS-Bench dataset from scratch, follow these steps:
- Nix-shell with OpenLane2, OpenROAD, TritonCTS, and Sky130 PDK
- Python 3.8+ with required dependencies
-
Enter the OpenLane nix-shell (containerized environment): Go to your openlane2 directory and
nix-shell
-
Clone the repository:
git clone <repo-url> cts-bench cd cts-bench
-
Set up Python virtual environment:
python -m venv venv source venv/bin/activate -
Install Python dependencies:
pip install -r requirements.txt
-
Run the full pipeline:
chmod +x run_scripts.sh ./run_scripts.sh
This executes:
scripts/1-gen-placement.py- Generate placements with randomized knobsscripts/2-gen-saif.py- Extract switching activity (SAIF)scripts/4-graph_to_cluster.py- Build raw and clustered graph representationsscripts/5-run-cts.py- Run multi-variant clock tree synthesisscripts/6-parse-cts-reports.py- Extract QoR metrics
| Stage | Script | Output |
|---|---|---|
| 1 | 1-gen-placement.py |
Placement DEF files + knob configurations |
| 2 | 2-gen-saif.py |
SAIF switching activity data |
| 3 | 4-graph_to_cluster.py |
Raw & Clustered graph tensors (PyTorch Geometric) |
| 4 | 5-run-cts.py |
Clock tree netlists + STA reports |
| 5 | 6-parse-cts-reports.py |
QoR metrics (Skew, Power, Wirelength, etc.) |
To add your own design to CTS-Bench:
-
Create a new directory in
designs/:mkdir -p designs/my_design cd designs/my_design -
Create
rtl/andtb/subdirectories:mkdir rtl tb
-
Add your RTL code to
rtl/:cp my_design.v rtl/
Ensure the module name matches your design directory name.
-
Add your testbench to
tb/:cp my_testbench.v tb/testbench.v
The testbench should:
- Match the module interface of your design
- Produce valid switching activity for SAIF generation
- Use a clock signal named according to your specification
Run the placement generation script with your design:
# From the repository root (within venv)
python scripts/1-gen-placement.pyThen call the placement generation function for your design:
from scripts.script_1_gen_placement import run_single_experiment
# Parameters: (design_name, clock_period_ns, clock_port_name)
run_single_experiment(
design_name="my_design",
clock_period=10.0, # in nanoseconds
clock_port_name="clk" # your clock signal name
)This generates 486 unique placements by randomizing the knobs in Table 1 of the paper.
Run the CTS generation script:
python scripts/5-run-cts.pyThen call the CTS function for your design:
from scripts.script_5_run_cts import run_cts_from_placement
# Parameters: (design_name, clock_period_ns, clock_port_name)
run_cts_from_placement(
design_name="my_design",
clock_period=10.0, # must match placement step
clock_port_name="clk" # must match placement step
)This generates 10 CTS variants per placement (4,860 total data points for your design).
Run the remaining stages (or use ./run_scripts.sh to run all at once):
# Extract switching activity
python scripts/2-gen-saif.py
# Construct graphs (raw and clustered)
python scripts/4-graph_to_cluster.py
# Run CTS variants
python scripts/5-run-cts.py
# Parse CTS reports and extract metrics
python scripts/6-parse-cts-reports.pyOr run all at once with your specified number of iterations:
python main.pyWhen adding a custom design, consider these parameters:
When adding a custom design, consider these parameters:
| Parameter | Example | Notes |
|---|---|---|
| Design Name | my_design | Used for folder naming; must be unique |
| Clock Period | 10.0 ns | Determines timing constraints; adjust for your design frequency |
| Clock Port | clk | Must match your RTL module's clock signal name |
| Gate Count | 1,000–100,000 | Affects graph size and training efficiency |
| Cell Types | Flip-flops, gates | Standard cells from Sky130 PDK |
After completion, you'll have:
dataset_root/
├── my_design_batch1.csv # QoR metrics for your design
└── graphs/my_design/
├── placement_0/
│ ├── raw_0.pt # Raw graph for placement variant 0
│ ├── clustered_0.pt # Clustered graph for placement variant 0
│ └── ... # (10 CTS variants each)
└── ... # (486 placements total)
This merges all design batches into clocknet_unified_manifest.csv.