Blazingly fast ML inference server powered by Rust and Burn framework
A high-performance, lightweight HTTP inference server that serves machine learning models with zero Python dependencies. Built with Rust for maximum performance and supports ONNX models including ResNet-18 for image classification.
- 🦀 Pure Rust: Maximum performance, minimal memory footprint (2.3MB binary)
- 🔥 ONNX Support: Direct ONNX model loading with automatic shape detection
- ⚡ Fast Inference: ~4ms inference times for ResNet-18
- 🛡️ Production Ready: Graceful shutdown, comprehensive error handling
- 🌐 HTTP API: RESTful endpoints with CORS support
- 📦 Single Binary: Zero external dependencies
- 🖼️ Image Classification: Optimized for computer vision models
git clone https://github.com/yourusername/furnace.git
cd furnace
cargo build --release
# Download ResNet-18 ONNX model (45MB)
curl -L "https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet18-v1-7.onnx" -o resnet18.onnx
./target/release/furnace --model-path resnet18.onnx --host 127.0.0.1 --port 3000
# Generate ResNet-18 test samples (creates JSON files locally)
cargo run --example resnet18_sample_data
This creates the following test files:
- `resnet18_single_sample.json` - Single image test data
- `resnet18_batch_sample.json` - Batch of 3 images test data
- `resnet18_full_test.json` - Full-size single image (150,528 values)
# Health check
curl http://localhost:3000/healthz
# Model info
curl http://localhost:3000/model/info
# Single image prediction
curl -X POST http://localhost:3000/predict \
-H "Content-Type: application/json" \
--data-binary @resnet18_full_test.json
# Batch prediction
curl -X POST http://localhost:3000/predict \
-H "Content-Type: application/json" \
--data-binary @resnet18_batch_sample.json
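The same requests can be made from Python with only the standard library. This is a minimal sketch: it assumes the server is running on `localhost:3000` and that the payload comes from one of the generated sample files (the exact JSON field names are whatever those files contain).

```python
import json
import urllib.request

def make_predict_request(url: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the /predict endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With the server running, send one of the generated sample files:
#   payload = json.load(open("resnet18_single_sample.json"))
#   req = make_predict_request("http://localhost:3000/predict", payload)
#   result = json.loads(urllib.request.urlopen(req).read())
```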
Furnace supports ONNX models with automatic shape detection. Currently optimized for image classification models.
Model | Input Shape | Output Shape | Size | Status |
---|---|---|---|---|
ResNet-18 | `[3, 224, 224]` | `[1000]` | 45MB | ✅ Supported |
MobileNet v2 | `[3, 224, 224]` | `[1000]` | 14MB | 🧪 Testing |
SqueezeNet | `[3, 224, 224]` | `[1000]` | 5MB | 🧪 Testing |
# ResNet-18 (ImageNet classification) - Recommended
curl -L "https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet18-v1-7.onnx" -o resnet18.onnx
# MobileNet v2 (lightweight, mobile-friendly)
curl -L "https://github.com/onnx/models/raw/main/validated/vision/classification/mobilenet/model/mobilenetv2-12.onnx" -o mobilenetv2.onnx
# SqueezeNet (very lightweight)
curl -L "https://github.com/onnx/models/raw/main/validated/vision/classification/squeezenet/model/squeezenet1.0-12.onnx" -o squeezenet.onnx
To use your own ONNX models:
- Export your model to ONNX format
- Ensure input shape compatibility (currently optimized for image classification)
- Test with Furnace using the same API endpoints
# Example: Export PyTorch model to ONNX
import torch
import torchvision.models as models

# `pretrained=True` is deprecated in newer torchvision; use the weights enum.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

# The dummy input fixes the exported graph's input shape: [batch, 3, 224, 224]
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "my_model.onnx")
Metric | Value |
---|---|
Binary Size | 2.3MB |
Model Size | 45MB |
Inference Time | ~4ms |
Memory Usage | <200MB |
Startup Time | <2s |
Input Size | 150,528 values |
Output Size | 1,000 classes |
Prerequisites:
# 1. Download ResNet-18 model (if not already done)
curl -L "https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet18-v1-7.onnx" -o resnet18.onnx
# 2. Generate test data (benchmarks use dynamic model detection)
cargo run --example resnet18_sample_data
Run benchmarks:
# Run all benchmarks
cargo bench
# Run specific benchmark
cargo bench single_inference
cargo bench batch_inference
cargo bench latency_measurement
- Single Inference: ~4ms per image (ResNet-18)
- Batch Processing: Optimized for batches of 1-8 images
- Concurrent Requests: Handles multiple simultaneous requests
- Memory Efficiency: Minimal memory allocation per request
- Throughput: Scales with available CPU cores
Health check endpoint
{
"status": "healthy",
"model_loaded": true,
"uptime_seconds": 3600,
"timestamp": "2024-01-01T12:00:00Z"
}
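A readiness probe only needs to inspect two of these fields. As a small sketch (field names taken from the response shown above):

```python
import json

def is_ready(health_body: str) -> bool:
    """Return True when the server reports healthy with a loaded model."""
    body = json.loads(health_body)
    return body.get("status") == "healthy" and body.get("model_loaded", False)

sample = '{"status": "healthy", "model_loaded": true, "uptime_seconds": 3600}'
```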
Model metadata and statistics
{
"model_info": {
"name": "resnet18",
"input_spec": {"shape": [3, 224, 224], "dtype": "float32"},
"output_spec": {"shape": [1000], "dtype": "float32"},
"model_type": "burn",
"backend": "onnx"
},
"stats": {
"inference_count": 42,
"total_inference_time_ms": 168.0,
"average_inference_time_ms": 4.0
}
}
Run inference on input data
Single Image:
curl -X POST http://localhost:3000/predict \
-H "Content-Type: application/json" \
--data-binary @resnet18_full_test.json
Batch Images:
curl -X POST http://localhost:3000/predict \
-H "Content-Type: application/json" \
--data-binary @resnet18_batch_sample.json
Response:
{
"output": [0.1, 0.05, 0.02, ...], // 1000 ImageNet class probabilities
"status": "success",
"inference_time_ms": 4.0,
"timestamp": "2024-01-01T12:00:00Z",
"batch_size": 1
}
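The `output` array is a flat list of 1,000 scores, so picking the most likely classes is a plain sort over (index, score) pairs. A minimal helper, assuming only the response shape shown above:

```python
def top_k(output, k=5):
    """Return (class_index, score) pairs for the k highest scores."""
    indexed = sorted(enumerate(output), key=lambda p: p[1], reverse=True)
    return indexed[:k]

# For a parsed response `r`: top_k(r["output"]) gives the 5 most likely
# ImageNet class indices; map them to labels with any ImageNet class list.
```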
ResNet-18 expects normalized RGB image data:
- Shape: `[3, 224, 224]` (150,528 values)
- Format: Flattened array of float32 values
- Range: Typically 0.0 to 1.0 (normalized pixel values)
- Order: Channel-first (RGB channels, then height, then width)
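The channel-first flattening can be sketched in a few lines. The `"input"` field name below is an assumption for illustration; match whatever the generated sample files use.

```python
import json

C, H, W = 3, 224, 224  # channel-first, as ResNet-18 expects

def flatten_chw(pixels):
    """Flatten a [C][H][W] nested list of normalized floats into the
    single 150,528-value array described above."""
    return [v for channel in pixels for row in channel for v in row]

# A uniform gray test image (every pixel 0.5); real inputs would come
# from an image decoded and normalized to [0, 1].
image = [[[0.5] * W for _ in range(H)] for _ in range(C)]
flat = flatten_chw(image)
payload = json.dumps({"input": flat})  # field name assumed
```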
- Rust 1.70+
- Cargo
cargo build --release
cargo test
Implement the `BurnModel` trait in `src/burn_model.rs` to add support for your own model architectures.
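The real trait signature lives in `src/burn_model.rs`; as an illustration only, a custom model tends to follow this general shape (the trait methods here are hypothetical stand-ins, not the actual trait):

```rust
// Hypothetical stand-in for the real trait in src/burn_model.rs.
trait BurnModel {
    fn name(&self) -> &str;
    fn predict(&self, input: &[f32]) -> Vec<f32>;
}

// A toy model that averages its input into a single score.
struct MeanModel;

impl BurnModel for MeanModel {
    fn name(&self) -> &str {
        "mean"
    }

    fn predict(&self, input: &[f32]) -> Vec<f32> {
        let sum: f32 = input.iter().sum();
        vec![sum / input.len() as f32]
    }
}

fn main() {
    let model = MeanModel;
    println!("{} -> {:?}", model.name(), model.predict(&[1.0, 2.0, 3.0]));
}
```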
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   CLI Layer     │───▶│   Model Layer   │───▶│   API Layer     │
│                 │    │                 │    │                 │
│ - Argument      │    │ - Model Loading │    │ - HTTP Routes   │
│   Parsing       │    │ - Inference     │    │ - Request       │
│ - Validation    │    │ - Metadata      │    │   Handling      │
│ - Logging Setup │    │ - Error Handling│    │ - CORS          │
└─────────────────┘    └─────────────────┘    └─────────────────┘
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.