PassNet

PassNet is an AI system for compiler optimization. It uses LLM-driven agents to automatically generate high-performance GPU kernels through compiler passes that optimize computation graphs. PassNet includes a complete optimization toolchain, the PassBench evaluation benchmark, and the PassAgent agent evaluation framework.

Project Structure

PassNet/
├── pass_bench/               # PassBench compiler evaluation framework: kernel compilation, correctness verification, performance benchmarking
├── pass_agent/               # PassAgent evaluation framework
├── samples/                  # PassBench sample data
├── sample_lists/             # PassBench sample list files (eval/train splits)
├── entry_scripts/            # Evaluation entry scripts
├── graphs/                   # Subgraph data
├── graph_lists/              # Subgraph lists and grouping info
├── test/                     # Unit tests
├── Dockerfile.nvidia         # Docker image definition
└── requirements.txt          # Python dependencies

Architecture Overview

┌─────────────────────────────────────────────────────────────────────────┐
│                             PassAgent                                   │
│                    (LLM-driven Pass Generation)                         │
│ ┌─────────────────────────────────────────────────────────────────────┐ │◄───┐
│ │  Multi-step Iterative Solving  ·  k-attempts  ·  R2E-Gym Framework  │ │    │
│ └─────────────────────────────────────────────────────────────────────┘ │    │
└────────────────┬───────────────────────────────────────┬────────────────┘    │
      read data  │                        generated pass │                     │
                 ▼                                       ▼                     │
┌───────────────────────────────────┐    ┌───────────────────────────────┐     │
│             DataSet               │    │          PassBench            │     │
│  ┌─────────────────────────────┐  │    │  ┌──────────────────────────┐ │     │
│  │ graphs/                     │  │    │  │ 1. Execution & Eval      │ │     │
│  │  sole_op  (5,939)           │  │    │  │    Eager Execution       │ │     │
│  │  fusible  (22,870)          │  │    │  │    pass_mgr Execution    │ │     │
│  │  typical  (25,151)          │  │    │  └────────────┬─────────────┘ │     │
│  └─────────────────────────────┘  │    │               │               │     │
│  ┌─────────────────────────────┐  │    │               ▼               │  feedback
│  │ samples/                    │  │    │  ┌──────────────────────────┐ │     │
│  │  sole_op  (1,029)           │  │    │  │ 2. Result Checking       │ │     │
│  │  fusible  (4,676)           │  │    │  │    Correctness & Speedup │ │     │
│  │  typical  (4,278)           │  │    │  └────────────┬─────────────┘ │     │
│  └─────────────────────────────┘  │    │               │               │     │
│  ┌─────────────────────────────┐  │    │               ▼               │     │
│  │ sample_lists/               │  │    │  ┌──────────────────────────┐ │     │
│  │  train/                     │  │    │  │ 3. Score Aggregation     │ │     │
│  │  eval/                      │  │    │  │    ES(t) & AS Met        │ │     │
│  └─────────────────────────────┘  │    │  └──────────────────────────┘ │     │
└───────────────────────────────────┘    └───────────────────────────────┘     │
                                                         └─────────────────────┘

Core Components

PassBench — Compiler Evaluation Framework

Provides kernel compilation, correctness verification, and performance benchmarking. It serves as both a standalone evaluation tool and the backend evaluation framework invoked by PassAgent:

- Kernel Compilation: Executes pass matching and replacement via the pass_mgr compiler method
- Correctness Verification: Validates numerical correctness of optimized kernels against dtype-specific tolerance thresholds (float32 / float16 / bfloat16)
- Performance Benchmarking: Measures speedup over 100 trials and outputs aggregated_score.json
- Score Aggregation: aggregate_es_scores.py computes ES(t) scores across all graphs in a sample
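The correctness check can be pictured as an element-wise comparison with per-dtype tolerances. The sketch below is a minimal pure-Python illustration, not PassBench's implementation: the `TOLERANCES` values and the `outputs_match` helper are hypothetical, and the comparison rule `|got - ref| <= atol + rtol * |ref|` is borrowed from `torch.allclose`.

```python
import math

# Hypothetical (rtol, atol) pairs per dtype. The thresholds PassBench
# actually applies are not stated in this README and may differ.
TOLERANCES = {
    "float32": (1e-5, 1e-6),
    "float16": (1e-3, 1e-3),
    "bfloat16": (1e-2, 1e-2),
}


def outputs_match(reference, candidate, dtype):
    """Return True if every candidate value is within tolerance of the reference."""
    rtol, atol = TOLERANCES[dtype]
    if len(reference) != len(candidate):
        return False
    for ref, got in zip(reference, candidate):
        if not (math.isfinite(ref) and math.isfinite(got)):
            # Infinities must match exactly; NaN never equals itself, so it fails.
            if ref != got:
                return False
            continue
        if abs(got - ref) > atol + rtol * abs(ref):
            return False
    return True
```

With these assumed thresholds, an error of 5e-4 on a value near 1.0 passes under float16 but fails under float32, which is the reason the tolerance is selected per dtype rather than fixed.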

PassAgent — R2E-Gym Agent Evaluation Framework

Evaluates agent capabilities for compiler optimization using the R2E-Gym framework. See pass_agent/README.md for details.

DataSet

graphs — Raw Subgraph Data

Stores raw computation subgraphs extracted from deep learning models, serving as the source for PassBench samples:

- fusible_subgraphs/: A small set of example fusible subgraphs (1,456), containing computation graphs with multi-operator fusion opportunities
- hf_subgraphs/ (Legacy): Previous version subgraph data, containing sole op (1,410), fusible (4,167), and typical (6,157) categories
- hf_subgraphs_v2/: HuggingFace model subgraphs, organized into three categories:
  - sole_op_subgraphs: Single-operator subgraphs (5,939)
  - fusible_subgraphs: Fusible subgraphs (22,870)
  - typical_subgraphs: Typical subgraphs (25,151)

graph_lists — Subgraph Lists and Grouping

Stores subgraph path lists, UID groupings, and other information for sample filtering and group management:

Subgraph Path Lists (line format: subgraph_UID\tsubgraph_relative_path)

File	Subgraphs	Description
`fusible_subgraphs.txt`	1,455	Example fusible subgraph paths
`hf_sole_op_subgraphs.txt`	1,410	Legacy sole op subgraph paths
`hf_fusible_subgraphs.txt`	4,166	Legacy fusible subgraph paths
`hf_typical_subgraphs.txt`	6,157	Legacy typical subgraph paths
`hf_sole_op_subgraphs_v2.txt`	5,939	v2 sole op subgraph paths
`hf_fusible_subgraphs_v2.txt`	22,870	v2 fusible subgraph paths
`hf_typical_subgraphs_v2.txt`	25,151	v2 typical subgraph paths
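
Each line in the path lists above has the form `subgraph_UID<TAB>subgraph_relative_path`. A minimal sketch of a reader for that format follows; `parse_subgraph_list` is a hypothetical helper written for illustration and is not part of the repository.

```python
from pathlib import PurePosixPath


def parse_subgraph_list(text):
    """Parse 'UID<TAB>relative_path' lines into a dict mapping UID -> path."""
    entries = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        uid, sep, rel_path = line.partition("\t")
        if not sep or not uid.strip() or not rel_path.strip():
            raise ValueError(f"line {lineno}: expected 'UID<TAB>path', got {line!r}")
        entries[uid.strip()] = PurePosixPath(rel_path.strip())
    return entries
```

Typical use would be `parse_subgraph_list(Path("graph_lists/hf_fusible_subgraphs_v2.txt").read_text())`, after which the resulting dict can be filtered by UID.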

samples — PassBench Evaluation Samples

Evaluation samples generated from graphs/, each serving as an independently executable evaluation unit:

- fusible_subgraphs/: A small set of example samples from TIMM models' fusible subgraphs, organized by model_name/subgraph_index
- hf_subgraphs/ (Legacy): Previous version subgraph samples, containing sole op (590), fusible (2,489), and typical (3,382) categories
- hf_subgraphs_v2/: v2 subgraph samples with extended multi-dtype support, containing sole op (1,029), fusible (4,676), and typical (4,278) categories. Samples are organized by hash path xx/yy/hash/, and the dataset is published at PassNet/PassNet

Each sample directory contains:

File	Description
`entry.sh`	Evaluation entry script that executes compilation, verification, and performance statistics
`graph_list.txt`	List of computation graphs included in the sample
`graphs/`	Computation graph definitions (model.py, weight_meta.py, etc.)
`pass_dir/`	Output directory for generated optimization passes
`pass_bench/`	Copy of the evaluation framework (for standalone execution within Docker containers)
`sample_uids.txt`	Unique sample identifier (hf_subgraphs_v2 only)
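
Before running a sample, it can help to confirm that the directory has the layout listed above. The following is a small sketch under that assumption; `EXPECTED_ENTRIES` and `missing_entries` are hypothetical names, and the check only tests for presence, not for valid contents.

```python
from pathlib import Path

# Entries from the table above. sample_uids.txt is omitted because it is
# present only in hf_subgraphs_v2 samples.
EXPECTED_ENTRIES = ["entry.sh", "graph_list.txt", "graphs", "pass_dir", "pass_bench"]


def missing_entries(sample_dir):
    """Return the expected files or directories that are absent from sample_dir."""
    root = Path(sample_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

An empty list means the sample directory has every entry the table describes.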

sample_lists — Eval/Train Sample Splits

Stores sample path lists for evaluation and training, organized by purpose and subgraph type, available in both txt and csv formats:

train/ (Training Set)

File	Samples	Description
`hf_sole_op_train_samples_v2.txt`	1,028	Sole op subgraph training samples
`hf_fusible_train_samples_v2.txt`	4,476	Fusible subgraph training samples
`hf_typical_train_samples_v2.txt`	4,078	Typical subgraph training samples
`hf_sole_op_train_samples.txt` (Legacy)	589	Legacy sole op subgraph training samples
`hf_fusible_train_samples.txt` (Legacy)	2,289	Legacy fusible subgraph training samples
`hf_typical_train_samples.txt` (Legacy)	3,182	Legacy typical subgraph training samples

eval/ (Evaluation Set)

File	Samples	Description
`hf_fusible_eval_samples_v2.txt`	200	Fusible subgraph evaluation samples
`hf_typical_eval_samples_v2.txt`	200	Typical subgraph evaluation samples
`hf_fusible_eval_samples.txt` (Legacy)	200	Legacy fusible subgraph evaluation samples
`hf_typical_eval_samples.txt` (Legacy)	200	Legacy typical subgraph evaluation samples

Quick Start

Requirements

- Python 3.12+
- PyTorch 2.9+ (CUDA 12.8)
- NVIDIA GPU (CUDA support)
- Docker (optional, for containerized evaluation)

Installation

cd /path/to/passnet

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export PYTHONPATH=$PYTHONPATH:/path/to/passnet

Run Example

# Verify sample evaluation
bash samples/fusible_subgraphs/crossvit_15_dagger_240.in1k/crossvit_15_dagger_240.in1k_0_start14_end16_4/entry.sh

Docker Usage

Build Image

docker build . -t passnet:latest -f Dockerfile.nvidia

Verify Single Sample Execution in Container

docker run --gpus all --privileged \
    -v <path-to-passnet-project>:/workspace \
    -w /workspace \
    passnet:latest \
    bash samples/fusible_subgraphs/crossvit_15_dagger_240.in1k/crossvit_15_dagger_240.in1k_0_start14_end16_4/entry.sh

PassBench Evaluation Pipeline

The PassNet evaluation pipeline works as follows:

1. Analyze computation graph: Read model.py and weight_meta.py to understand the target subgraph's operators, tensor shapes, and dtypes
2. Generate optimization pass: LLM agent generates a pass file and places it in pass_dir/
3. Pass matching and replacement: pass_mgr matches the pattern in the FX graph and replaces it with the optimized kernel
4. Correctness verification: Compare eager and compiled outputs using dtype-specific tolerance thresholds
5. Performance benchmarking: Measure speedup and compute ES(t), output aggregated_score.json
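
The benchmarking step can be sketched as timing the eager and compiled callables over repeated trials and taking a ratio. This is an illustration only: `time_trials` and `speedup` are hypothetical helpers, the use of the median is an assumption (the README states only that speedup is measured over 100 trials), and ES(t) is not computed here. For GPU kernels, the timed callable would also need to synchronize the device, since CUDA launches return before the kernel finishes.

```python
import statistics
import time


def time_trials(fn, trials=100):
    """Run fn `trials` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(trials):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def speedup(eager_times, compiled_times):
    """Ratio of median eager time to median compiled time; above 1.0 means faster."""
    return statistics.median(eager_times) / statistics.median(compiled_times)
```

A call such as `speedup(time_trials(run_eager), time_trials(run_compiled))` would give the ratio, where `run_eager` and `run_compiled` are placeholders for the two execution paths.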

# place your pass file
cp MyPass.py samples/<type>/<hash>/pass_dir/
echo '["MyPass"]' > samples/<type>/<hash>/pass_dir/sorted_output_pass_rule_names.json

# run evaluation for a single sample
bash samples/<type>/<hash>/entry.sh

See pass_bench/README.md for pass file format and batch evaluation.
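
The two shell steps for placing a pass can also be scripted. The sketch below mirrors them in Python; `install_pass` is a hypothetical helper, and it assumes the rule name equals the pass file's stem, as in the `MyPass.py` / `"MyPass"` example above.

```python
import json
import shutil
from pathlib import Path


def install_pass(pass_file, sample_dir):
    """Copy a pass file into sample_dir/pass_dir and list it in the rule-name JSON."""
    pass_file = Path(pass_file)
    pass_dir = Path(sample_dir) / "pass_dir"
    pass_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(pass_file, pass_dir / pass_file.name)
    # Assumption: the rule name is the file name without its .py suffix.
    names_file = pass_dir / "sorted_output_pass_rule_names.json"
    names_file.write_text(json.dumps([pass_file.stem]) + "\n")
    return names_file
```

After this, the sample's `entry.sh` can be run as shown above, for instance through `subprocess.run(["bash", "entry.sh"], cwd=sample_dir, check=True)`.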

PassAgent Evaluation

Evaluate agents using the PassAgent framework:

cd pass_agent
pip install -r requirements.txt

python examples/run_pass_agent_demo.py \
    --llm-name openai/glm-4.7 \
    --llm-base-url <your-llm-base-url> \
    --openai-api-key <your-api-key> \
    --dataset datasets/passbench_demo_dataset.jsonl \
    --max-steps 50 \
    --k 10

See pass_agent/README.md for details.

License

Please refer to the license file in the project root directory.
