Production-Ready Outlier Detection System

A comprehensive framework for robust outlier detection with uncertainty estimation, designed specifically for business intelligence applications.

🎯 Overview

This system combines multiple detection methods with uncertainty quantification to provide reliable anomaly detection for production environments. It includes statistical baselines, model-based detectors, density-based methods, and weighted score fusion.

📓 Notebooks

detection_bi_domain.ipynb - Production BI System

The main production-ready notebook with comprehensive business intelligence features:

  • Multi-method outlier detection (Statistical, Model-based, Density-based)
  • Score fusion and uncertainty estimation
  • Complete benchmarking and evaluation framework
  • Unit testing and production recommendations
  • BI-focused deployment guidance

detection_basics.ipynb - Educational Tutorial

Foundational notebook demonstrating core concepts:

  • Logistic regression baseline and overconfidence issues
  • Temperature scaling and calibration techniques (sketched after this list)
  • Bayesian logistic regression for uncertainty quantification
  • Educational examples and visualizations
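
For intuition, here is a minimal temperature-scaling sketch, assuming you already have validation logits and labels from a fitted classifier; the function names and toy data are illustrative, not taken from the notebook.

import numpy as np
from scipy.optimize import minimize_scalar

def nll(T, logits, labels):
    # Negative log-likelihood of the temperature-scaled softmax
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)                 # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def fit_temperature(val_logits, val_labels):
    # Fit a single scalar T > 0 on held-out logits; T > 1 softens overconfident outputs
    res = minimize_scalar(nll, bounds=(0.05, 10.0), args=(val_logits, val_labels),
                          method="bounded")
    return res.x

# Toy example: one mislabeled, overconfident row pushes T above 1
logits = np.array([[4.0, -4.0], [3.5, -3.0], [-2.0, 2.5]])
labels = np.array([0, 1, 1])
print(f"fitted temperature: {fit_temperature(logits, labels):.2f}")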

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • Jupyter Notebook environment
  • Required packages: numpy, pandas, scikit-learn, matplotlib, seaborn, scipy, joblib

Installation

  1. Clone or download the repository
  2. Install dependencies: pip install -r requirements.txt
  3. Start with detection_basics.ipynb for concepts, then detection_bi_domain.ipynb for production

Running the Notebooks

For Production System:

  1. Open detection_bi_domain.ipynb
  2. Execute cells sequentially from top to bottom
  3. Results will be saved to the results/ directory
  4. Models and artifacts will be saved to the artifacts/ directory

For Learning:

  1. Start with detection_basics.ipynb to understand fundamentals
  2. Learn about overconfidence, calibration, and Bayesian approaches
  3. Then proceed to the full production system

📊 System Architecture

Detection Methods Included

  • Statistical Baselines: Z-score and IQR-based detection (see the sketch after this list)
  • Model-Based: Isolation Forest and Local Outlier Factor (LOF)
  • Density-Based: Kernel Density Estimation (KDE) and Gaussian Mixture Models (GMM)
  • Score Fusion: Weighted combination of all detector outputs
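
To make the families concrete, the sketch below scores test points with one or two detectors from each family using scikit-learn; the data and hyperparameters are placeholders rather than the notebook's configuration.

import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor, KernelDensity
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)
X_train = rng.normal(size=(500, 2))                      # stand-in training data
X_test = np.vstack([rng.normal(size=(50, 2)),
                    rng.normal(5.0, 1.0, size=(5, 2))])  # last 5 rows are outliers

# Statistical baseline: max absolute z-score per sample (higher = more anomalous)
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
z = np.abs((X_test - mu) / sigma).max(axis=1)

# Model-based: negate score_samples so that higher = more anomalous
iso = IsolationForest(random_state=42).fit(X_train)
lof = LocalOutlierFactor(novelty=True).fit(X_train)      # novelty=True allows scoring new data
iso_s, lof_s = -iso.score_samples(X_test), -lof.score_samples(X_test)

# Density-based: negative log-likelihood under the fitted density
kde = KernelDensity(bandwidth=0.5).fit(X_train)
gmm = GaussianMixture(n_components=2, random_state=42).fit(X_train)
kde_s, gmm_s = -kde.score_samples(X_test), -gmm.score_samples(X_test)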

Key Features

  • Reproducible Results: Fixed seeds and data integrity verification
  • Comprehensive Evaluation: AUROC, AUPRC, FPR@95TPR, ECE metrics
  • Production Ready: Unit tests, monitoring, and deployment guidelines
  • BI-Focused: Specific recommendations for business intelligence use cases

πŸ“ Directory Structure

outlier-detection/
├── detection_bi_domain.ipynb      # Main production BI system
├── detection_basics.ipynb         # Educational tutorial notebook
├── artifacts/                     # Saved models and data (included in .gitignore)
├── results/                       # Evaluation results and reports
├── tests/                         # Unit tests
├── README.md                      # This file
├── CHANGELOG.md                   # Project change history
└── requirements.txt               # Python dependencies

🔬 Methodology

1. Data Generation

  • Synthetic 2D dataset with clear in-distribution (IND) / out-of-distribution (OOD) separation
  • Two interlocking half-circles (moons) for in-distribution data
  • Gaussian cluster for out-of-distribution data
  • Data integrity verification with SHA256 hashing (the full setup is sketched below)
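
A minimal sketch of this setup, assuming make_moons for the in-distribution data and a shifted Gaussian for the OOD cluster; the sample counts and noise levels are illustrative.

import hashlib
import numpy as np
from sklearn.datasets import make_moons

rng = np.random.default_rng(42)

# In-distribution: two interlocking half-circles
X_ind, _ = make_moons(n_samples=1000, noise=0.1, random_state=42)

# Out-of-distribution: a Gaussian cluster placed away from the moons
X_ood = rng.normal(loc=[3.0, 3.0], scale=0.3, size=(100, 2))

X = np.vstack([X_ind, X_ood])
y = np.concatenate([np.zeros(len(X_ind)), np.ones(len(X_ood))])   # 1 = OOD

# Integrity check: hash the raw bytes so later runs can verify the exact dataset
print("sha256:", hashlib.sha256(X.tobytes()).hexdigest())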

2. Feature Engineering

  • Standardization pipeline with train/validation/test splits (sketched after this list)
  • Prevents data leakage and ensures proper scaling
  • Configurable scaling methods (Standard, Robust, MinMax)
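
The sketch below shows the leakage-safe pattern this step describes: fit the scaler on the training split only, then transform validation and test. The helper name and the 60/20/20 split are assumptions.

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, RobustScaler, MinMaxScaler

SCALERS = {"standard": StandardScaler, "robust": RobustScaler, "minmax": MinMaxScaler}

def split_and_scale(X, y, method="standard", seed=42):
    # 60/20/20 split; the scaler sees only the training data (no leakage)
    X_train, X_tmp, y_train, y_tmp = train_test_split(
        X, y, test_size=0.4, random_state=seed, stratify=y)
    X_val, X_test, y_val, y_test = train_test_split(
        X_tmp, y_tmp, test_size=0.5, random_state=seed, stratify=y_tmp)
    scaler = SCALERS[method]().fit(X_train)
    return (scaler.transform(X_train), y_train,
            scaler.transform(X_val), y_val,
            scaler.transform(X_test), y_test)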

3. Multi-Method Detection

  • Statistical methods for baseline comparison
  • Advanced ML models for complex pattern detection
  • Density-based approaches for likelihood estimation
  • Hyperparameter optimization where applicable

4. Score Fusion

  • Combines all detector outputs into a single anomaly score
  • Weight optimization using validation data
  • Normalized score scaling for fair comparison
  • Grid search for optimal fusion parameters (sketched below)
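
One plausible implementation, assuming higher scores mean more anomalous: min-max normalize each detector's validation scores, then grid-search convex weights by AUROC. The function names are illustrative.

import itertools
import numpy as np
from sklearn.metrics import roc_auc_score

def normalize(s):
    # Min-max scale so detectors with different ranges are comparable
    return (s - s.min()) / (s.max() - s.min() + 1e-12)

def fuse(score_list, weights):
    return sum(w * normalize(s) for w, s in zip(weights, score_list))

def fit_fusion_weights(val_scores, y_val, step=0.1):
    # Coarse grid over convex weights; fine for a handful of detectors
    best_auc, best_w = -1.0, None
    grid = np.arange(0.0, 1.0 + step, step)
    for w in itertools.product(grid, repeat=len(val_scores)):
        if not np.isclose(sum(w), 1.0):
            continue
        auc = roc_auc_score(y_val, fuse(val_scores, w))
        if auc > best_auc:
            best_auc, best_w = auc, w
    return best_w, best_auc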

5. Comprehensive Evaluation

  • Multiple metrics for thorough assessment
  • Calibration analysis for reliability
  • Benchmark comparison across all methods
  • Production-ready performance reporting

📈 Results and Benchmarking

Results are automatically generated and saved to:

  • results/benchmarks.csv - Detailed performance metrics
  • results/data_summary.csv - Dataset statistics
  • results/environment_info.json - Reproducibility information
  • results/production_recommendations.json - Deployment guidance

Key Metrics Tracked

  • AUROC: Area Under ROC Curve
  • AUPRC: Area Under Precision-Recall Curve
  • FPR@95TPR: False Positive Rate at 95% True Positive Rate
  • ECE: Expected Calibration Error (FPR@95TPR and ECE are sketched below)
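
AUROC and AUPRC come directly from sklearn.metrics (roc_auc_score and average_precision_score); the two less common metrics can be computed as in this sketch, where ECE uses the binary reliability-diagram form.

import numpy as np
from sklearn.metrics import roc_curve

def fpr_at_95_tpr(y_true, scores):
    # FPR at the first threshold where TPR reaches 95%
    fpr, tpr, _ = roc_curve(y_true, scores)
    return fpr[np.searchsorted(tpr, 0.95)]

def expected_calibration_error(y_true, probs, n_bins=10):
    # Binned |observed positive rate - mean predicted probability|, weighted by bin mass
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (probs > lo) & (probs <= hi)
        if mask.any():
            ece += mask.mean() * abs(y_true[mask].mean() - probs[mask].mean())
    return ece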

🏭 Production Deployment

Recommended Approach

  1. Start Conservative: Use 95th percentile thresholds initially
  2. Monitor Closely: Track false positive rates and business impact
  3. Human-in-the-Loop: Review high-uncertainty cases manually
  4. Regular Maintenance: Monthly threshold tuning and model updates

Alert Configuration

  • LOW: 75th percentile threshold, 24h review SLA
  • MEDIUM: 90th percentile threshold, 4h review SLA
  • HIGH: 95th percentile + uncertainty, immediate review
  • CRITICAL: 99th percentile, immediate escalation (tier mapping sketched below)
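
The mapping from fused scores to these tiers can be as simple as the sketch below; the percentiles are computed over a reference (e.g. validation) score distribution, and the tier names mirror the list above.

import numpy as np

def alert_level(score, reference_scores):
    # Cut points follow the tiers above, computed on a reference distribution
    t75, t90, t95, t99 = np.percentile(reference_scores, [75, 90, 95, 99])
    if score >= t99:
        return "CRITICAL"   # immediate escalation
    if score >= t95:
        return "HIGH"       # immediate review (pair with the uncertainty estimate)
    if score >= t90:
        return "MEDIUM"     # 4h review SLA
    if score >= t75:
        return "LOW"        # 24h review SLA
    return "OK"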

Monitoring Strategy

  • Daily anomaly count tracking
  • Feature importance analysis
  • Model drift detection
  • Performance degradation alerts

🧪 Testing

Unit Tests

Run the built-in unit tests:

# Tests run automatically inside the notebook; to run them separately:
python -m pytest tests/test_outlier_detection.py -v

Integration Testing

The notebook includes a complete integration test that verifies the points below; a minimal pytest-style sketch follows the list:

  • Data generation consistency
  • Model training pipeline
  • Score fusion functionality
  • Evaluation metrics calculation
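
If you extend the suite, a minimal pytest-style test might look like this sketch; the test names and assertions are illustrative, not the repository's actual tests.

import numpy as np
from sklearn.datasets import make_moons

def test_data_generation_is_reproducible():
    # Fixed seed => byte-identical data across runs
    X1, _ = make_moons(n_samples=100, noise=0.1, random_state=42)
    X2, _ = make_moons(n_samples=100, noise=0.1, random_state=42)
    np.testing.assert_array_equal(X1, X2)

def test_fused_scores_are_finite():
    scores = np.array([0.1, 0.4, 0.9])       # stand-in for a fusion output
    assert np.all(np.isfinite(scores))
    assert scores.argmax() == 2               # highest score flags the outlier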

🔧 Customization

Adding New Detectors

  1. Implement the detector with an sklearn-compatible fit / score_samples API (example sketched after this list)
  2. Add to evaluation pipeline in benchmarking section
  3. Update score fusion system to include new method
  4. Add unit tests for new functionality
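
For example, a detector only needs fit and score_samples to plug into this pipeline; the Mahalanobis detector below is a hypothetical illustration, not part of the repository.

import numpy as np

class MahalanobisDetector:
    """Hypothetical detector: distance to the training distribution."""

    def fit(self, X, y=None):
        self.mean_ = X.mean(axis=0)
        cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # regularized covariance
        self.precision_ = np.linalg.inv(cov)
        return self

    def score_samples(self, X):
        # Higher = more normal (sklearn convention); negate downstream for anomaly scores
        d = X - self.mean_
        return -np.sqrt(np.einsum("ij,jk,ik->i", d, self.precision_, d))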

Modifying Thresholds

  • Adjust contamination parameters in detector initialization
  • Update alert severity levels in production recommendations
  • Retrain fusion weights with new threshold preferences

Custom Data

  • Replace synthetic data generation with your data loading code
  • Ensure proper train/validation/test splits
  • Update feature engineering pipeline as needed
  • Verify data integrity and tracking

🤝 Contributing

  1. Add new detection methods or improvements
  2. Enhance evaluation metrics
  3. Improve production deployment tools
  4. Expand unit test coverage
  5. Add real-world use case examples

📄 License

This project is provided as-is for educational and commercial use under the Apache License 2.0.

📞 Support

For questions or issues:

  1. Check the notebook comments and documentation
  2. Review the unit tests for usage examples
  3. Consult the production recommendations for deployment guidance
  4. Examine the results files for performance insights

Last updated: September 2025 | Version: 1.0