A full-stack fraud detection dashboard where users upload transaction CSVs (or use sample Kaggle data) and get ML + rule-based risk scoring with charts and an interactive table.
- Backend: Python + Flask + pandas + scikit-learn (Isolation Forest)
- Frontend: React (Vite) + TypeScript + Tailwind + Recharts + TanStack Table
- ML (40%): Isolation Forest anomaly detection on normalized features (Amount, Hour, V1-V5 if present).
- Rules (60%): Large outliers, micro-transactions, round numbers, late-night hours.
- Scores combine to `risk_score` and map to Low/Medium/High/Critical; reasons are generated per row.
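
The weighting described above can be sketched as follows. This is a minimal illustration, not the project's actual `detector.py`: the rule thresholds, per-rule points, level cut-offs, and the `combine` function name are all assumptions. Only the 40/60 split, the four rule types, and the level names come from this README. The ML score is assumed to arrive already scaled to 0-100.

```python
from __future__ import annotations

ML_WEIGHT_PCT = 40    # from the README: ML contributes 40%
RULE_WEIGHT_PCT = 60  # from the README: rules contribute 60%

# Hypothetical thresholds, chosen only for illustration.
LARGE_AMOUNT = 1000.0
MICRO_AMOUNT = 1.0
LATE_NIGHT_HOURS = range(0, 5)  # 00:00 to 04:59


def rule_score(amount: float, hour: int) -> tuple[float, list[str]]:
    """Return a 0-100 rule score plus the reasons for each rule that fired."""
    score = 0.0
    reasons: list[str] = []
    if amount >= LARGE_AMOUNT:
        score += 40
        reasons.append(f"Large amount: ${amount:.2f}")
    if 0 < amount <= MICRO_AMOUNT:
        score += 20
        reasons.append("Micro-transaction")
    if amount >= 100 and amount % 100 == 0:
        score += 20
        reasons.append("Round number amount")
    if hour in LATE_NIGHT_HOURS:
        score += 20
        reasons.append("Late-night transaction")
    return min(score, 100.0), reasons


def risk_level(score: int) -> str:
    """Map a 0-100 score to a level (cut-offs are assumed, not from the source)."""
    if score >= 75:
        return "Critical"
    if score >= 50:
        return "High"
    if score >= 25:
        return "Medium"
    return "Low"


def combine(ml_score: float, amount: float, hour: int) -> dict:
    """Blend an ML anomaly score (0-100) with the rule score using the 40/60 split."""
    rules, reasons = rule_score(amount, hour)
    if ml_score >= 50:  # assumed cut-off for reporting the ML signal
        reasons.append("ML detected unusual pattern")
    score = round((ML_WEIGHT_PCT * ml_score + RULE_WEIGHT_PCT * rules) / 100)
    return {"riskScore": score, "riskLevel": risk_level(score), "reasons": reasons}
```

For example, `combine(50, 2000.0, 2)` fires the large-amount, round-number, and late-night rules (rule score 80), which blends with the ML score of 50 to a `riskScore` of 68.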
cd backend
python -m pip install --upgrade pip setuptools wheel
pip install -r requirements.txt
python app.py

Backend runs at http://127.0.0.1:5000.
cd frontend
npm install
npm run dev # or: npm run build && npm run preview

Frontend runs at http://localhost:5173 (shown in terminal).
The "Generate Sample Data" button requires the Kaggle Credit Card Fraud dataset for best results.
Download dataset:
- Go to https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud
- Sign in (or create account)
- Click Download
- Extract the ZIP and place `creditcard.csv` at `backend/data/creditcard.csv`
Without the dataset: You can still upload your own CSV files with an Amount column.
- `GET /api/health`: health check
- `GET /api/sample-data`: returns an analyzed sample (100 random Kaggle rows)
- `POST /api/analyze`: analyzes a CSV upload (multipart/form-data, field name `file`). Limited to the first 500 rows for speed.
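
As a rough illustration of calling the upload endpoint from a script, the sketch below posts a CSV to `/api/analyze` using only the Python standard library. The helper names `build_multipart` and `analyze` and the default filename are made up for this example. The URL, the field name `file`, and the multipart encoding are taken from the endpoint list above.

```python
import json
import urllib.request
import uuid

BASE_URL = "http://127.0.0.1:5000"  # backend address from the setup steps


def build_multipart(csv_bytes: bytes, filename: str = "transactions.csv"):
    """Encode one CSV as multipart/form-data under the field name 'file'.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + csv_bytes + tail, f"multipart/form-data; boundary={boundary}"


def analyze(csv_bytes: bytes) -> dict:
    """POST the CSV to /api/analyze and return the parsed JSON response."""
    body, content_type = build_multipart(csv_bytes)
    req = urllib.request.Request(
        f"{BASE_URL}/api/analyze",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires the backend to be running locally.
    result = analyze(b"Amount\n12.50\n9800.00\n")
    print(result["statistics"])
```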
{
"transactions": [{
"id": "T001",
"amount": 123.45,
"riskScore": 72,
"riskLevel": "High",
"reasons": ["Large amount: $123.45", "ML detected unusual pattern"]
}],
"statistics": {
"totalTransactions": 100,
"suspiciousCount": 23,
"fraudRate": 23.0,
"totalAtRisk": 50234.12
},
"riskDistribution": [{"name": "High", "value": 10, "color": "hsl(25 95% 55%)"}],
"fraudComparison": [{"name": "Analysis", "fraudulent": 23, "normal": 77}]
}

- Required column: `Amount`
- Optional columns (improve model): `Time` (seconds; converts to Hour), `V1..Vn` (PCA features from the Kaggle credit card dataset), `Class` (0/1) for reference only.
- Upload as CSV, or click Generate Sample Data to fetch a Kaggle sample from the backend.
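
The column requirements above could be checked before upload with something like the sketch below. It uses the standard-library `csv` module instead of pandas to stay dependency-free, so it is not how the backend itself parses files. The function name `load_transactions` is hypothetical, and the `Time` to hour conversion assumes `Time` counts seconds from the first transaction and wraps every 24 hours.

```python
import csv
import io

MAX_ROWS = 500  # the API analyzes only the first 500 rows


def load_transactions(csv_text: str) -> list:
    """Validate the CSV header and return up to MAX_ROWS parsed rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if "Amount" not in (reader.fieldnames or []):
        raise ValueError("CSV must contain an 'Amount' column")

    rows = []
    for i, raw in enumerate(reader):
        if i >= MAX_ROWS:
            break
        row = {"amount": float(raw["Amount"])}
        if raw.get("Time"):
            # Assumption: elapsed seconds, folded into an hour of day (0-23).
            row["hour"] = int(float(raw["Time"]) // 3600) % 24
        rows.append(row)
    return rows
```

For example, a row with `Time` of 90000 seconds falls 25 hours after the start, which this sketch maps to hour 1.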
- Dev mode uses Flask's built-in server, which is not suitable for production. For production, run the app under Gunicorn or uWSGI behind a reverse proxy.
- Frontend dev mode is slower; `npm run build && npm run preview` is faster.
- If npm is missing, install Node.js LTS.
- If pip errors on pandas/numpy, upgrade pip/setuptools/wheel (already in setup above).
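
For the production note above, one possible starting point is a Gunicorn config file, which is plain Python. This is a sketch under assumptions: the file name `gunicorn.conf.py`, the port, the worker count, and the timeout are illustrative, and `app:app` assumes the Flask instance in `backend/app.py` is named `app`. The `wsgi_app` setting requires Gunicorn 20.1 or newer.

```python
# gunicorn.conf.py (hypothetical file, placed in backend/)
import multiprocessing

# Assumes backend/app.py exposes a Flask instance named `app`.
wsgi_app = "app:app"

# Listen on localhost only; the reverse proxy forwards traffic here.
bind = "127.0.0.1:8000"

# A common rule of thumb for sync workers; tune for the actual host.
workers = multiprocessing.cpu_count() * 2 + 1

# Allow extra time for analysis of larger uploads (illustrative value).
timeout = 60
```

With this file in place, running `gunicorn -c gunicorn.conf.py` from `backend/` would start the API, assuming Gunicorn is installed in the same environment.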
- `backend/app.py`: Flask API endpoints
- `backend/detector.py`: ML + rule engine and response shaping
- `frontend/`: React + Tailwind dashboard (charts, table, upload UI)
- Hover text disappearing (light mode) fixed in table header buttons.
- Background particles disabled for performance; re-enable via `ParticleBackground.tsx` if desired.
- Sample data randomness: `/api/sample-data` samples new fraud/normal rows each call.
- Start backend: `python app.py`
- Start frontend: `npm run dev`
- Open the browser, then use Generate Sample Data or upload a CSV.