Automated DRAM address mapping reverse engineering for x86_64, ARMv8, and POWER9/10. Preprint available at https://arxiv.org/abs/2509.19568
Knock-Knock provides two workflows for DRAM reverse engineering:
- Automatic Pipeline (main --full-analysis): Complete C++ implementation that automatically detects thresholds, discovers bank masks, and derives row masks in one run. Disclaimer: automatic threshold detection is not always accurate. If the results look wrong, check the latency distribution with plot_histogram.py.
- Manual Pipeline (main --timing + full_analysis.py): Two-stage process in which C++ collects timing data and Python then performs offline analysis with full control over parameters.
Platform Support: x86_64, ARMv8 (AArch64), POWER9/10
Core Techniques:
- Timing-based row buffer conflict detection
- GF(2) nullspace analysis for mask discovery
- Minimal-weight basis optimization
- Automated threshold detection using histogram analysis
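To illustrate why a GF(2) nullspace yields bank masks, here is a minimal pure-Python sketch. It assumes each bank function is the parity of the address bits selected by a mask, so two same-bank addresses a1 and a2 satisfy parity(m & (a1 ^ a2)) == 0 for every bank mask m. The 6-bit address space, the BANK_MASKS ground truth, and the function names are hypothetical and are not taken from the Knock-Knock sources.

```python
def parity(x):
    """Parity (XOR of all bits) of a non-negative integer."""
    return bin(x).count("1") & 1


def gf2_nullspace(rows, nbits):
    """Return a basis of all masks m with parity(m & r) == 0 for every row r."""
    reduced = {}  # pivot bit -> row; each pivot bit is set in exactly one row
    for r in rows:
        # Clear every existing pivot bit from the incoming row.
        for p, b in reduced.items():
            if (r >> p) & 1:
                r ^= b
        if r == 0:
            continue  # linearly dependent on rows seen so far
        p = r.bit_length() - 1  # new pivot: highest remaining bit
        for q in list(reduced):
            if (reduced[q] >> p) & 1:
                reduced[q] ^= r
        reduced[p] = r
    # One nullspace vector per free (non-pivot) bit position.
    basis = []
    for f in range(nbits):
        if f in reduced:
            continue
        m = 1 << f
        for p, row in reduced.items():
            if (row >> f) & 1:
                m |= 1 << p
        basis.append(m)
    return basis


# Hypothetical ground truth: bank bit 0 = a3 XOR a5, bank bit 1 = a4.
BANK_MASKS = [0b101000, 0b010000]


def bank(addr):
    return tuple(parity(addr & m) for m in BANK_MASKS)


# XOR differences of all same-bank pairs in a toy 6-bit address space.
diffs = [a1 ^ a2
         for a1 in range(64)
         for a2 in range(a1 + 1, 64)
         if bank(a1) == bank(a2)]

masks = gf2_nullspace(diffs, nbits=6)
print([bin(m) for m in masks])
```

On real measurements the difference rows are noisy, which is presumably why the tool combines this step with subsampling rather than running it once over all samples.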
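make
This produces main (automatic pipeline) and supporting components.
On ARMv8, build and load the PMU kernel module:
cd enable_arm_pmu
make
sudo ./load-module
For the optional Python analysis, set up the environment:
./setup_python_env.sh # Automated setup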
source venv/bin/activate
# Or install manually:
pip3 install numpy pandas matplotlib galois scipy
python3 check_python_deps.py
The automatic pipeline runs all three stages in C++ with a single command. It performs threshold detection, bank mask inference, and row mask derivation automatically.
sudo ./main --full-analysis -p 50
This runs the complete analysis using 50% of system memory.
Stage 1: Threshold Detection
- Collects random memory access latencies
- Builds histogram and applies smoothing
- Uses "Find the Bump's Left Foot" heuristic to separate hits from conflicts
- Outputs: latencies.dat, smoothed_histogram.dat, analysis_points.dat
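The idea behind Stage 1 can be sketched as follows. This is not the tool's actual heuristic, only one plausible reading of "find the bump's left foot": build a histogram, smooth it with a moving average, locate the most prominent secondary peak to the right of the main peak, and walk left from it until the smoothed count falls to a small fraction of the bump height. The parameters bin_width, radius, and foot_frac are hypothetical, the latencies are synthetic, and the sketch assumes conflicts are both slower and rarer than hits.

```python
import random


def smoothed_histogram(samples, bin_width=4, radius=2):
    """Histogram integer latencies, then apply a moving-average filter."""
    lo = min(samples)
    counts = [0] * ((max(samples) - lo) // bin_width + 1)
    for s in samples:
        counts[(s - lo) // bin_width] += 1
    smooth = []
    for i in range(len(counts)):
        window = counts[max(0, i - radius): i + radius + 1]
        smooth.append(sum(window) / len(window))
    return lo, smooth


def left_foot_threshold(samples, bin_width=4, radius=2, foot_frac=0.05):
    """Return the latency at the left foot of the high-latency bump."""
    lo, sm = smoothed_histogram(samples, bin_width, radius)
    main = max(range(len(sm)), key=sm.__getitem__)  # row-hit peak
    # Bump = bin right of the main peak that rises highest above the
    # lowest point seen since the main peak (its prominence).
    running_min = sm[main]
    best, bump, valley = 0.0, main, sm[main]
    for i in range(main + 1, len(sm)):
        running_min = min(running_min, sm[i])
        if sm[i] - running_min > best:
            best, bump, valley = sm[i] - running_min, i, running_min
    if bump == main:
        raise ValueError("no high-latency bump found")
    # Walk left from the bump until the curve drops to the foot level.
    level = valley + foot_frac * (sm[bump] - valley)
    foot = bump
    while foot > main and sm[foot] > level:
        foot -= 1
    return lo + foot * bin_width


# Synthetic latencies: many fast hits, fewer slow row-buffer conflicts.
rng = random.Random(1)
hits = [int(rng.gauss(200, 8)) for _ in range(9000)]
conflicts = [int(rng.gauss(300, 10)) for _ in range(1000)]
threshold = left_foot_threshold(hits + conflicts)
print("threshold:", threshold)
```

If the two modes overlap heavily, a heuristic like this becomes unreliable, which matches the advice to verify the result with plot_histogram.py.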
Stage 2: Bank Mask Discovery
- Collects conflict samples (same-bank pairs)
- Performs GF(2) nullspace analysis with subsampling
- Finds minimal-weight basis of candidate masks
- Evaluates accuracy with confusion matrix
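A nullspace basis is not unique, and the vectors a solver returns are often XOR combinations of the simpler masks one wants to report. The sketch below shows one way to obtain a minimal-weight basis when the nullspace dimension is small: enumerate every nonzero combination, sort by Hamming weight, and greedily keep vectors that are linearly independent of those already chosen. The exhaustive enumeration and the function names are assumptions for illustration; the tool may use a different optimization.

```python
def span(basis):
    """All nonzero XOR combinations of the given GF(2) vectors."""
    vecs = {0}
    for b in basis:
        vecs |= {v ^ b for v in vecs}
    vecs.discard(0)
    return vecs


def minimal_weight_basis(basis):
    """Greedy lowest-weight basis of the space spanned by `basis`.

    Exponential in len(basis), so only suitable for small dimensions.
    """
    chosen = []
    reduced = {}  # highest set bit -> reduced vector (XOR linear basis)
    by_weight = sorted(span(basis), key=lambda v: (bin(v).count("1"), v))
    for v in by_weight:
        r = v
        while r:
            p = r.bit_length() - 1
            if p not in reduced:
                reduced[p] = r  # independent of everything chosen so far
                chosen.append(v)
                break
            r ^= reduced[p]
    return chosen


# Hypothetical solver output: the first vector is the XOR of two simpler masks.
raw = [0b111000, 0b101000]
print([bin(m) for m in minimal_weight_basis(raw)])
```

Because linear independence over GF(2) forms a matroid, the greedy choice by weight yields a basis of minimum total weight.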
Stage 3: Row Mask Derivation
- Collects same-bank pairs with known timing behavior
- Applies invariance filtering (bits constant in hits, varying in conflicts)
- Derives row masks from bank mask nullspace
- Validates against collected samples
General options:
- -m <MB>: Memory size in megabytes (default: 25600 MB / 25 GB)
- -p <percent>: Portion of system DRAM to allocate (overrides -m)
- -d, --delay <us>: Inter-pair delay in microseconds to avoid tFAW violations (default: 0)
Threshold detection options:
- --threshold <cycles>: Manually set threshold, skip auto-detection (default: auto)
  - Example: --threshold 260
- --threshold-samples <n>: Samples for histogram (default: 100000)
  - Example: --threshold-samples 200000
Bank mask options:
- --bank-conflicts <n>: Target conflict samples (default: 5000)
- --bank-measurements <n>: Max measurements (default: 3000000)
- --bank-subsample <n>: Subsample size for nullspace (default: 1000)
- --bank-rounds <n>: Subsampling iterations (default: 35)
- --bank-attempts <n>: Max retry attempts (default: 3)
Row mask options:
- --row-pairs <n>: Target same-bank pairs (default: 8000)
- --row-min-hits <n>: Minimum hit samples (default: 1000)
- --row-min-conflicts <n>: Minimum conflict samples (default: 1000)
- --row-max-attempts <n>: Max sampling attempts (default: 2400000)
Other options:
- -v: Verbose output
- -d, --delay <us>: Delay between measurements (µs) for tFAW mitigation (default: 0)
- --force-multiple-rounds: Force multiple rounds for validation
Examples:
sudo ./main --full-analysis -p 25
sudo ./main --full-analysis -p 50 \
    --bank-conflicts 10000 --bank-rounds 60 --row-pairs 15000
sudo ./main --full-analysis --threshold 260 -p 25
sudo ./main --full-analysis -p 50 \
    --threshold-samples 200000 \
    --bank-conflicts 15000 --bank-subsample 2000 --bank-rounds 80 \
    --row-pairs 20000 --row-min-hits 3000
sudo ./main --full-analysis -p 50 --delay 10
Output files:
- latencies.dat: Raw latency samples for threshold analysis
- smoothed_histogram.dat: Histogram with smoothing applied
- analysis_points.dat: Auto-detected threshold, peaks, confidence score
- threshold_analysis_summary.txt: Detailed threshold detection report
The manual pipeline separates data collection (C++) from analysis (Python), giving you full control over the analysis parameters and allowing iterative experimentation.
Use the legacy timing measurement mode to generate CSV files:
sudo ./main --timing -a -p 50 -n 100000 -r 50
- -a: Auto-name output as data/<hostname>_<mem>.csv
- -o <file>: Specify custom output file
- -p <percent>: Memory allocation (% of system RAM)
- -m <MB>: Memory allocation (absolute MB, overridden by -p)
- -n <count>: Number of address pairs to measure (default: 100000)
- -r <rounds>: Timing rounds per pair for median calculation (default: 50)
- -d <us>: Inter-measurement delay in microseconds
sudo ./main --bitflip -p 50 -r 50
Systematically flips bits in physical addresses to probe bank mapping functions.
python3 full_analysis.py data/<hostname>_<mem>.csv --thresh <value> [options]
- --thresh <cycles>: Latency threshold separating hits from conflicts
  - Determine this by examining the timing distribution
  - Typical values: 150-300 cycles depending on system
- --subsample <n>: Subsample size for GF(2) nullspace (default: 1000)
  - Controls memory usage and convergence speed
  - Larger = slower but potentially more accurate
- --repeat <n>: Number of subsampling rounds (default: 50)
  - More rounds = better mask frequency statistics
  - Increase for difficult systems (60-100)
- --sensitivity <float>: Sensitivity for row bit analysis (default: 0.05)
  - Range: 0.0 to 1.0
  - Lower = stricter filtering
- --limit <n>: Limit pairs processed (for testing)
  - Useful for quick validation runs
- --verbose, -v: Detailed progress output
The script performs:
- Data loading with consistency filtering
- Binary difference matrix construction
- GF(2) nullspace analysis with subsampling
- Minimal-weight basis optimization
- Accuracy evaluation (precision, recall, F1 score)
- Bank-separated timing analysis
- Row buffer analysis with invariant detection
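As an illustration of the accuracy evaluation step, the sketch below scores a candidate mask set against timing labels. It assumes a pair is predicted to conflict when every mask gives both addresses the same parity (same bank), and is observed to conflict when its latency exceeds the threshold. The masks, addresses, latencies, and function names are made up for the example.

```python
def parity(x):
    return bin(x).count("1") & 1


def same_bank(a1, a2, masks):
    """Predicted same bank: every bank function agrees on both addresses."""
    return all(parity((a1 ^ a2) & m) == 0 for m in masks)


def evaluate(pairs, masks, threshold):
    """Precision, recall and F1 of the masks against timing labels."""
    tp = fp = fn = 0
    for a1, a2, cycles in pairs:
        predicted = same_bank(a1, a2, masks)
        observed = cycles > threshold
        if predicted and observed:
            tp += 1
        elif predicted:
            fp += 1
        elif observed:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


masks = [0b0110, 0b1000]  # hypothetical candidate bank masks
pairs = [                  # (a1, a2, elapsed_cycles), all made up
    (0b0000, 0b0110, 310),  # same bank, slow  -> true positive
    (0b0001, 0b0111, 305),  # same bank, slow  -> true positive
    (0b0000, 0b0010, 190),  # diff bank, fast  -> true negative
    (0b0000, 0b1000, 200),  # diff bank, fast  -> true negative
    (0b0000, 0b0100, 320),  # diff bank, slow  -> false negative
    (0b0011, 0b0101, 180),  # same bank, fast  -> false positive
]
precision, recall, f1 = evaluate(pairs, masks, threshold=260)
print(precision, recall, f1)
```

Same-bank, same-row pairs are fast and would count as false positives here; with randomly chosen addresses they are rare, but they put a ceiling on the achievable precision.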
After either workflow, visualize threshold detection:
python3 plot_histogram.py [--interactive] [--save output.png]
Options:
- --interactive: Click to select manual threshold
- --save <file>: Save plot instead of displaying
a1,a2,elapsed_cycles,v_a1,v_a2
1a2b3c4d,5e6f7890,234,7f8e9d0c,1b2a3948
- a1,a2: Physical addresses (hex)
- elapsed_cycles: Access latency in CPU cycles
- v_a1,v_a2: Virtual addresses (hex)
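For custom offline analysis, a file in this format can be loaded with the standard library alone. The sketch below assumes the header row shown above, hexadecimal addresses with or without a 0x prefix, and integer cycle counts. The second data row and the threshold of 260 are illustrative values, not measurements.

```python
import csv
import io


def load_pairs(fileobj):
    """Parse timing CSV rows into (a1, a2, elapsed_cycles) integer tuples."""
    pairs = []
    for row in csv.DictReader(fileobj):
        pairs.append((int(row["a1"], 16),
                      int(row["a2"], 16),
                      int(row["elapsed_cycles"])))
    return pairs


# In practice: open("data/<hostname>_<mem>.csv", newline="")
sample = io.StringIO(
    "a1,a2,elapsed_cycles,v_a1,v_a2\n"
    "1a2b3c4d,5e6f7890,234,7f8e9d0c,1b2a3948\n"
    "0a0b0c0d,1e1f2021,312,7f001000,7f002000\n"  # hypothetical second row
)
pairs = load_pairs(sample)

THRESHOLD = 260  # cycles; system-specific, chosen from the histogram
conflicts = [p for p in pairs if p[2] > THRESHOLD]
hits = [p for p in pairs if p[2] <= THRESHOLD]
print(len(hits), "hits,", len(conflicts), "conflicts")
```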
x86_64:
- Cache eviction: clflush instruction
- Timing: rdtsc/rdtscp timestamp counter
- Barriers: mfence, lfence
ARMv8:
- Cache eviction: DC CIVAC (clean & invalidate)
- Timing: PMCCNTR_EL0 cycle counter (requires PMU kernel module)
- Barriers: DSB, ISB
- Anti-speculation: Dependency chains to prevent speculative loads
POWER9/10:
- Cache eviction: dcbf (data cache block flush)
- Timing: Time-base register
- Barriers: sync, isync
Bank masks fail to converge:
- Increase conflict samples: --bank-conflicts 10000
- More subsampling rounds: --bank-rounds 80
- Larger subsample: --bank-subsample 2000
- Verify threshold: python3 plot_histogram.py
Row masks not found:
- Increase same-bank pairs: --row-pairs 15000
- Raise minimums: --row-min-hits 2000
- More attempts: --row-max-attempts 3000000
Threshold detection fails:
- Manual threshold: --threshold <value>
- More samples: --threshold-samples 200000
- Visualize distribution: python3 plot_histogram.py
Determining threshold:
- Generate histogram: python3 plot_histogram.py --interactive
- Look for the "left foot" where the high-latency bump begins
- Choose a value in the valley between the two modes
Low accuracy (<70%):
- Verify threshold is correct
- Increase Python analysis rounds: --repeat 100
- Larger subsample: --subsample 2000
- Collect more data with longer C++ runs
Python analysis tips:
- Start with --verbose to see detailed progress
- Use --limit 50000 for quick testing
- Increase --repeat if masks are inconsistent
- Check consistency filtering output
System-specific:
- High contention: Add --delay 10
- Noisy timings: Double all sample counts
- Slow systems: Reduce memory with -p 10
- Ensure system is idle during measurements
- Linux with root access (for /proc/self/pagemap and PMU control)
- GCC with C++11 support, GNU make
- Python 3.8+ (optional, for analytics)
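Root access matters because the kernel only exposes page frame numbers in /proc/self/pagemap to privileged processes. The sketch below shows how a virtual address maps to a physical one through that interface, following the documented entry layout (bit 63 = page present, bits 0-54 = page frame number). The function names are hypothetical and this is not the tool's C++ code.

```python
import os
import struct

PFN_MASK = (1 << 55) - 1   # bits 0-54 hold the page frame number
PRESENT_BIT = 1 << 63      # bit 63: page is present in RAM


def decode_pagemap_entry(entry, vaddr, page_size):
    """Turn a 64-bit pagemap entry into a physical address, or None."""
    if not entry & PRESENT_BIT:
        return None
    pfn = entry & PFN_MASK
    return pfn * page_size + (vaddr % page_size)


def read_physical_address(vaddr, page_size=None):
    """Look up vaddr of the current process (Linux, needs root for PFNs)."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    with open("/proc/self/pagemap", "rb") as f:
        f.seek((vaddr // page_size) * 8)  # one 8-byte entry per page
        (entry,) = struct.unpack("=Q", f.read(8))
    return decode_pagemap_entry(entry, vaddr, page_size)


# Decoding a made-up entry: present page with PFN 0x12345, 4 KiB pages.
example = decode_pagemap_entry(PRESENT_BIT | 0x12345, 0x7F0000001ABC, 4096)
print(hex(example))
```

Without sufficient privileges the kernel reports the PFN field as zero, so every address would appear to map to frame 0.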
Issues and PRs welcome. Please document your hardware platform, kernel version, and parameter configuration. Keep patches focused and well-tested.
If you use this tool in your research, please cite the Knock-Knock paper and let us know about your findings.
@inproceedings{plin2026knockknock,
title = {Knock-Knock: Black-Box, Platform-Agnostic DRAM Address-Mapping Reverse Engineering},
author = {Antoine Plin and Lorenzo Casalino and Thomas Rokicki and Ruben Salvador},
booktitle = {2026: Proceedings of the Microarchitecture Security Conference (uASC '26)},
series = {Proceedings of the Microarchitecture Security Conference},
publisher = {Ruhr University Bochum},
address = {Leuven, Belgium},
year = {2026},
month = feb,
day = {3},
}