- End-to-End Latency Observability: Tracks "Tick-to-Signal" (TTS) latency from the moment a record is read until the alpha signal is generated, using a high-resolution hdrhistogram.
- CPU Affinity & Pinning: Automatically pins pipeline stage workers to dedicated physical cores to minimize OS scheduling jitter and cache misses.
- Zero-Allocation Hot Path: All data models are Pod (Plain Old Data) and cache-line aligned (#[repr(align(64))]) to prevent false sharing. Stages use fxhash for fast internal state management.
- Accurate TTS Metrics: Synchronized time measurement and warm-up stabilization ensure reported Tick-to-Signal latencies represent steady-state production performance.
- SIMD-Friendly Signal Calculation: Alpha signals (Weighted Order Book Imbalance) are calculated in loops structured so that the compiler can auto-vectorize them into SIMD instructions.
- Real-Time Simulation: Supports a --simulate-live mode to replay historical data at its original exchange-timestamp speed, allowing for realistic system testing.
- High Throughput: Capable of processing over 4 million events per second (4+ MEPS) on a single core.
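To make the zero-allocation, cache-line-aligned idea concrete, here is a minimal sketch using only the standard library. The `Tick` record and `TickBatch` buffer are hypothetical names invented for illustration, and their fields are assumptions; they are not the crate's actual data models.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical tick record. Field names and widths are illustrative
/// assumptions, not the crate's actual layout.
#[derive(Clone, Copy, Debug, Default)]
#[repr(C, align(64))]
pub struct Tick {
    pub ts_recv: u64,  // receive timestamp in nanoseconds
    pub order_id: u64,
    pub price: i64,    // fixed-point price
    pub size: u32,
    pub side: u8,
    pub action: u8,
}

// Compile-time checks: each record starts on a 64-byte boundary and fills
// exactly one 64-byte slot, so two records never share a cache line.
const _: () = assert!(align_of::<Tick>() == 64);
const _: () = assert!(size_of::<Tick>() == 64);

/// Fixed-capacity buffer whose storage is created once, up front.
/// Pushing and clearing never touch the allocator.
pub struct TickBatch<const N: usize> {
    items: [Tick; N],
    len: usize,
}

impl<const N: usize> TickBatch<N> {
    pub fn new() -> Self {
        Self { items: [Tick::default(); N], len: 0 }
    }

    /// Returns false instead of growing when the batch is full.
    pub fn push(&mut self, tick: Tick) -> bool {
        if self.len == N {
            return false;
        }
        self.items[self.len] = tick;
        self.len += 1;
        true
    }

    pub fn as_slice(&self) -> &[Tick] {
        &self.items[..self.len]
    }

    pub fn clear(&mut self) {
        self.len = 0;
    }
}
```

The trade-off is memory: padding a 30-byte payload to 64 bytes wastes space, which is usually acceptable when records are handed between threads and false sharing would otherwise dominate.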
The system uses a multi-stage threaded pipeline where data flows through wait-free journals. The full implementation can be found in main.rs.
graph LR
A[DBN File] -->|Decoding| B(Stage 1: Importer)
B -->|MBO Entry| C(Stage 2: Order Tracker)
C -->|MBO Delta| D(Stage 3: Price Aggregator)
D -->|Price Level| E(Stage 4: Alpha Gen)
E -->|Signal| F[Strategy/Log]
subgraph "Thread per Stage (CPU Pinned)"
C
D
E
end
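The journals that connect the stages can be pictured as single-producer/single-consumer ring buffers. The sketch below is one common way to build such a wait-free queue with the standard library alone; `Journal`, `try_push`, and `try_pop` are hypothetical names, and the project's real journal implementation may differ.

```rust
use std::array;
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Single-producer/single-consumer ring buffer (illustrative sketch).
/// `N` must be a power of two so that indices can be masked.
pub struct Journal<T: Copy + Default, const N: usize> {
    slots: [UnsafeCell<T>; N],
    head: AtomicUsize, // next slot to read; written only by the consumer
    tail: AtomicUsize, // next slot to write; written only by the producer
}

// Safety: this sketch assumes exactly one producer thread and one consumer
// thread. The Acquire/Release pairs on `head` and `tail` ensure a slot is
// never read and written at the same time.
unsafe impl<T: Copy + Default + Send, const N: usize> Sync for Journal<T, N> {}

impl<T: Copy + Default, const N: usize> Journal<T, N> {
    pub fn new() -> Self {
        assert!(N.is_power_of_two(), "capacity must be a power of two");
        Self {
            slots: array::from_fn(|_| UnsafeCell::new(T::default())),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    /// Producer side. Returns false when the ring is full.
    /// Completes in a bounded number of steps: no locks, no retry loops.
    pub fn try_push(&self, value: T) -> bool {
        let tail = self.tail.load(Ordering::Relaxed);
        let head = self.head.load(Ordering::Acquire);
        if tail.wrapping_sub(head) == N {
            return false;
        }
        unsafe { *self.slots[tail & (N - 1)].get() = value };
        // Publish the slot to the consumer.
        self.tail.store(tail.wrapping_add(1), Ordering::Release);
        true
    }

    /// Consumer side. Returns None when the ring is empty.
    pub fn try_pop(&self) -> Option<T> {
        let head = self.head.load(Ordering::Relaxed);
        let tail = self.tail.load(Ordering::Acquire);
        if head == tail {
            return None;
        }
        let value = unsafe { *self.slots[head & (N - 1)].get() };
        // Hand the slot back to the producer.
        self.head.store(head.wrapping_add(1), Ordering::Release);
        Some(value)
    }
}
```

In a thread-per-stage layout, each stage would hold the consumer end of one journal and the producer end of the next, typically shared through an `Arc`.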
- Normalization (LightMboEntry): Compact MBO record with ts_recv tagging.
- Order Tracking (MboDelta): Captures the change in volume at a specific price point.
- Aggregation (BookLevelEntry): Maintains total volume per price level.
- Book State (BookLevelTop): Top-5 price levels, maintained within the Alpha Gen stage.
- Signal (ImbalanceSignal): The final alpha output with end-to-end latency metadata.
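As an illustration of how a weighted order book imbalance could be derived from a top-5 book state, the sketch below uses a hypothetical `TopLevels` struct standing in for BookLevelTop. The field layout, the depth weights, and the exact formula are all assumptions; the source does not specify them.

```rust
pub const DEPTH: usize = 5;

/// Hypothetical top-of-book snapshot (stand-in for the real book state).
/// Index 0 is the best level on each side.
#[derive(Clone, Copy, Debug, Default)]
#[repr(C, align(64))]
pub struct TopLevels {
    pub bid_size: [f32; DEPTH],
    pub ask_size: [f32; DEPTH],
}

/// Assumed weights: each level counts half as much as the one above it.
pub const WEIGHTS: [f32; DEPTH] = [1.0, 0.5, 0.25, 0.125, 0.0625];

/// Weighted imbalance in [-1, 1]:
///   sum(w_i * (bid_i - ask_i)) / sum(w_i * (bid_i + ask_i))
/// Positive values mean more weighted volume on the bid side.
/// Returns 0.0 for an empty book.
pub fn weighted_imbalance(book: &TopLevels) -> f32 {
    let mut num = 0.0f32;
    let mut den = 0.0f32;
    // Fixed trip count and a branch-free body: a shape the compiler
    // can unroll and is free to vectorize.
    for i in 0..DEPTH {
        let w = WEIGHTS[i];
        num += w * (book.bid_size[i] - book.ask_size[i]);
        den += w * (book.bid_size[i] + book.ask_size[i]);
    }
    if den > 0.0 { num / den } else { 0.0 }
}
```

Keeping sizes in fixed-length arrays rather than a map or vector is what lets the loop run without bounds surprises or heap access. Whether SIMD instructions are actually emitted depends on the target and optimization flags.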
# High-speed backtest (Maximum throughput)
cargo run --release --example databento_replay -- --file path/to/data.dbn --pin-cores
# Live simulation (Real-time speed)
cargo run --release --example databento_replay -- --file path/to/data.dbn --simulate-live
The engine reports:
- MEPS: Millions of Events Per Second processed.
- P99.9 Latency: Tail latency for both stage execution and end-to-end signal generation.
- Throughput Stats: Periodic logs showing the processing rate and average speed.
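For readers who want to see how such figures are typically derived, here is a dependency-free sketch. The engine itself uses hdrhistogram for latency; this example swaps in a simple nearest-rank percentile over sorted samples to stay within the standard library. The function names `meps` and `percentile` are hypothetical.

```rust
use std::time::Duration;

/// Millions of events per second over a measured interval.
pub fn meps(events: u64, elapsed: Duration) -> f64 {
    events as f64 / elapsed.as_secs_f64() / 1e6
}

/// Nearest-rank percentile. `sorted_ns` must already be sorted ascending;
/// `q` is in percent (for example 99.9). Returns None for empty input or
/// an out-of-range `q`.
pub fn percentile(sorted_ns: &[u64], q: f64) -> Option<u64> {
    if sorted_ns.is_empty() || !(0.0..=100.0).contains(&q) {
        return None;
    }
    // Rank is the smallest 1-based position covering q percent of samples.
    let rank = ((q / 100.0) * sorted_ns.len() as f64).ceil() as usize;
    Some(sorted_ns[rank.max(1) - 1])
}
```

Sorting every sample costs memory and time proportional to the sample count, which is why a bucketed histogram is the usual choice on a hot path. The sorted form is mainly useful for offline checks of reported percentiles.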
Based on the latest benchmarks (perf.log), run in a performance-tuned environment with --pin-cores, the system achieves:
Final Imbalance Signals: 24,191,908
Throughput: 4.05 MEPS (Million Events Per Second)
Execution Time: 5.97s
TTS Latency (Tick-to-Signal): p50=2.9us, p90=6.1us, p99=34.3us, p999=208.4us, p9999=1.47ms
Architectural Stats:
- IPC (Instructions Per Cycle): 0.74
- Branch Misprediction: 2.67%
- L1 Data Cache Miss: 1.94%