🧠 Developer Stress Predictor

An end-to-end Machine Learning application that predicts developer stress levels based on work patterns, habits, and environmental factors. Built with production-grade practices including CI/CD pipelines, containerization, automated testing, and cloud deployment.

📋 Table of Contents

Overview
Features
Architecture
Model Details
Getting Started
API Reference
Deployment
Testing
Project Structure
Tech Stack
Contributing

🎯 Overview

Developer burnout is a real problem in the tech industry. This project provides a data-driven approach to predict and monitor stress levels, helping developers and teams take proactive measures before burnout occurs.

The application consists of:

REST API: A FastAPI backend serving predictions with OpenAPI documentation
Web UI: An interactive Streamlit dashboard for easy predictions and visualization
ML Model: A Random Forest Regressor trained on developer work patterns

Both services are deployed on Google Cloud Run with automatic scaling and CI/CD pipelines.

✨ Features

🔮 Prediction Engine

Predicts stress level on a scale of 0-100
Takes into account 10 different work-related factors
Provides personalized recommendations based on stress level
Supports both single and batch predictions

🌐 REST API

Full OpenAPI/Swagger documentation
API key authentication
Health checks and monitoring endpoints
Model introspection (feature importance, metrics)

📊 Interactive Dashboard

User-friendly form for inputting work patterns
Real-time stress level visualization with gauge charts
Monitoring dashboard with model metrics
Prediction history tracking

🏭 Production Ready

Dockerized services with multi-stage builds
CI pipeline with linting, type checking, and 41+ tests
CD pipeline with automatic deployment to Cloud Run
Secret management with Google Secret Manager
Auto-scaling from zero instances upward to handle variable load

🏗️ Architecture

                                    ┌──────────────────────────────────────┐
                                    │           Google Cloud Run           │
                                    └──────────────────────────────────────┘
                                                      │
                       ┌──────────────────────────────┼──────────────────────────────┐
                       │                              │                              │
                       ▼                              ▼                              ▼
              ┌─────────────────┐          ┌─────────────────┐          ┌─────────────────┐
              │                 │          │                 │          │                 │
              │   Streamlit     │─────────▶│    FastAPI      │─────────▶│  RandomForest   │
              │   Frontend      │   HTTP   │    Backend      │          │     Model       │
              │                 │          │                 │          │                 │
              └─────────────────┘          └─────────────────┘          └─────────────────┘
                     │                            │                            │
                     │                            │                            │
              ┌──────┴──────┐              ┌──────┴──────┐              ┌──────┴──────┐
              │  Plotly     │              │  Pydantic   │              │ scikit-learn│
              │  Charts     │              │  Validation │              │  R² = 0.89  │
              └─────────────┘              └─────────────┘              └─────────────┘

Data Flow

User Input → Streamlit collects work pattern data through an interactive form
API Request → Data is validated and sent to FastAPI backend
Prediction → Random Forest model processes features and returns stress level
Visualization → Results displayed with gauge charts and recommendations

📊 Model Details

Input Features

Feature	Type	Description	Range
`Hours_Worked`	Numeric	Hours worked per day	1-24
`Sleep_Hours`	Numeric	Hours of sleep per night	1-12
`Bugs`	Numeric	Number of bugs to fix	0-50+
`Deadline_Days`	Numeric	Days until deadline	0-60+
`Coffee_Cups`	Numeric	Daily coffee consumption	0-20
`Meetings`	Numeric	Number of daily meetings	0-24
`Interruptions`	Numeric	Daily interruptions count	0-50
`Experience_Years`	Categorical	Developer experience level	Junior / Mid / Senior
`Code_Complexity`	Categorical	Project complexity	Low / Medium / High
`Remote_Work`	Categorical	Remote work status	Yes / No
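
The ranges in the table above can be checked on the client side before a request is sent. The following is a minimal sketch using only the standard library. The helper name `validate_features` is hypothetical, the open-ended ranges ("50+", "60+") are treated as having no upper bound, and the project's actual validation lives in its Pydantic schemas, which may differ.

```python
from typing import Any, Optional

# Numeric ranges copied from the Input Features table.
# None means the table gives no upper bound ("50+", "60+").
NUMERIC_RANGES: dict[str, tuple[float, Optional[float]]] = {
    "Hours_Worked": (1, 24),
    "Sleep_Hours": (1, 12),
    "Bugs": (0, None),
    "Deadline_Days": (0, None),
    "Coffee_Cups": (0, 20),
    "Meetings": (0, 24),
    "Interruptions": (0, 50),
}

# Allowed values for the categorical features, also from the table.
CATEGORICAL_VALUES: dict[str, tuple[str, ...]] = {
    "Experience_Years": ("Junior", "Mid", "Senior"),
    "Code_Complexity": ("Low", "Medium", "High"),
    "Remote_Work": ("Yes", "No"),
}


def validate_features(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems: list[str] = []
    for name, (low, high) in NUMERIC_RANGES.items():
        value = payload.get(name)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: expected a number, got {value!r}")
        elif value < low or (high is not None and value > high):
            problems.append(f"{name}: {value} is outside the documented range")
    for name, allowed in CATEGORICAL_VALUES.items():
        value = payload.get(name)
        if value not in allowed:
            problems.append(f"{name}: expected one of {allowed}, got {value!r}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a form show every invalid field at once.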

Performance Metrics

Metric	Train	Test
R² Score	0.92	0.89
RMSE	4.1	5.2
MAE	3.2	4.1
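
For readers unfamiliar with these metrics, here is a small pure-Python sketch of how R², RMSE, and MAE are defined. It is illustrative only: the function name `regression_metrics` is hypothetical, and the project itself presumably computes these with scikit-learn during training.

```python
import math
from typing import Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> dict[str, float]:
    """Compute R², RMSE, and MAE for paired true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # MAE: average absolute error, in the same units as the stress score.
    mae = sum(abs(r) for r in residuals) / n

    # RMSE: square root of the mean squared error; penalizes large misses more.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    # R²: share of the variance in y_true that the predictions explain.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all true values are identical")
    r2 = 1 - ss_res / ss_tot

    return {"r2": r2, "rmse": rmse, "mae": mae}
```

Because RMSE and MAE share the units of the 0-100 stress scale, a test MAE of 4.1 means predictions are off by about four points on average.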

Feature Importance

The top factors influencing stress prediction:

😴 Sleep Hours - Most significant predictor
⏰ Hours Worked - Strong correlation with stress
🔔 Interruptions - Frequent interruptions increase stress
🐛 Bugs - Technical debt impact
📅 Deadline Days - Time pressure effects

🚀 Getting Started

Prerequisites

Python 3.11+
Docker (optional, for containerized deployment)
Google Cloud account (optional, for cloud deployment)

Local Installation

# Clone the repository
git clone https://github.com/Acquarts/developer-stress-predictor-ml-production.git
cd developer-stress-predictor-ml-production

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set environment variables
cp .env.example .env
# Edit .env with your configuration

Running Locally

Option 1: Run services separately

# Terminal 1 - Start the API
uvicorn src.api.main:app --reload --port 8000

# Terminal 2 - Start Streamlit
streamlit run streamlit_app/app.py

Option 2: Use Docker Compose

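# The compose file lives in infrastructure/ (see Project Structure)
docker-compose -f infrastructure/docker-compose.yml up --build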

# Services available at:
# API:       http://localhost:8000
# API Docs:  http://localhost:8000/docs
# Streamlit: http://localhost:8501

Training a New Model

python scripts/train_model.py

This trains a new model on data/developer_stress.csv and saves it to models/stress_model.joblib.

📡 API Reference

Base URL

Local: http://localhost:8000
Production: https://stress-api-562289298058.us-central1.run.app

Authentication

All prediction endpoints require an API key, sent in the X-API-Key request header:

X-API-Key: your-api-key

Endpoints

Health Check

GET /health

Returns service health status and model loading state.

Response:

{
  "status": "healthy",
  "model_loaded": true,
  "version": "1.0.0"
}
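
A deployment script or smoke test might use this response to decide whether the service is ready. The sketch below assumes the response shape shown above; the helper names `is_ready` and `fetch_health` are hypothetical and not part of the project.

```python
import json
import urllib.request


def is_ready(health_body: str) -> bool:
    """Return True only if the service reports healthy AND the model is loaded."""
    data = json.loads(health_body)
    return data.get("status") == "healthy" and data.get("model_loaded") is True


def fetch_health(base_url: str, timeout: float = 5.0) -> str:
    """GET <base_url>/health and return the raw response body as text."""
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Assumes the API is running locally on port 8000.
    print(is_ready(fetch_health("http://localhost:8000")))
```

Checking `model_loaded` as well as `status` guards against a container that is up but has not finished loading the model file.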

Single Prediction

POST /predict

Request Body:

{
  "Hours_Worked": 10,
  "Sleep_Hours": 6,
  "Bugs": 15,
  "Deadline_Days": 7,
  "Coffee_Cups": 4,
  "Meetings": 3,
  "Interruptions": 5,
  "Experience_Years": "Mid",
  "Code_Complexity": "Medium",
  "Remote_Work": "Yes"
}

Response:

{
  "stress_level": 67.5,
  "warnings": ["Consider taking breaks - stress level is elevated"]
}
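
As an illustration, the request above could be sent from Python with only the standard library. This is a sketch, not the project's own client: `build_predict_request` and `predict` are hypothetical helper names, and the API key value is a placeholder.

```python
import json
import urllib.request
from typing import Any


def build_predict_request(
    base_url: str, api_key: str, features: dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a POST /predict request with the API key header."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def predict(base_url: str, api_key: str, features: dict[str, Any]) -> dict[str, Any]:
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(base_url, api_key, features)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    sample = {
        "Hours_Worked": 10, "Sleep_Hours": 6, "Bugs": 15, "Deadline_Days": 7,
        "Coffee_Cups": 4, "Meetings": 3, "Interruptions": 5,
        "Experience_Years": "Mid", "Code_Complexity": "Medium", "Remote_Work": "Yes",
    }
    # Assumes a local API and a placeholder key; replace both as needed.
    print(predict("http://localhost:8000", "your-api-key", sample))
```

Separating request construction from sending makes the header and body easy to inspect in tests without a running server.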

Batch Prediction

POST /predict/batch

Request Body:

{
  "predictions": [
    { "Hours_Worked": 8, "Sleep_Hours": 7, ... },
    { "Hours_Worked": 12, "Sleep_Hours": 5, ... }
  ]
}
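
When many records need scoring, they can be grouped under the `predictions` key shown above. The sketch below splits a list of feature dictionaries into several batch payloads. The helper name and the `max_per_request` limit of 100 are assumptions for illustration; the API's actual batch size limit, if any, is not documented here.

```python
from typing import Any, Iterator


def build_batch_payloads(
    rows: list[dict[str, Any]], max_per_request: int = 100
) -> Iterator[dict[str, Any]]:
    """Yield request bodies for POST /predict/batch, each holding at most
    max_per_request feature dictionaries under the "predictions" key."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    for start in range(0, len(rows), max_per_request):
        yield {"predictions": rows[start:start + max_per_request]}
```

Each yielded dictionary can be serialized with `json.dumps` and posted to `/predict/batch` with the same `X-API-Key` header used for single predictions.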

Model Information

GET /model/info

Returns model metadata including type, parameters, and training metrics.

Feature Importance

GET /model/features

Returns feature importance scores from the trained model.
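
The exact response shape is not documented here. Assuming the endpoint returns a flat mapping of feature name to importance score, a client could rank the features as in the sketch below. The helper name `top_features` is hypothetical, and the numbers in the usage example are made up for illustration, not real model output.

```python
def top_features(importances: dict[str, float], k: int = 5) -> list[tuple[str, float]]:
    """Return the k features with the highest importance, largest first."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical scores, NOT taken from the trained model.
    example = {"Sleep_Hours": 0.30, "Hours_Worked": 0.25, "Bugs": 0.10, "Meetings": 0.05}
    for name, score in top_features(example, k=3):
        print(f"{name}: {score:.2f}")
```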

🚢 Deployment

Google Cloud Run

The project includes GitHub Actions workflows for automatic deployment:

CI Pipeline (.github/workflows/ci.yml)
- Runs on every push and PR
- Linting with Ruff
- Type checking with MyPy
- Unit and integration tests with Pytest
- Security scanning with Bandit

CD Pipeline (.github/workflows/cd.yml)
- Triggers on push to main
- Builds Docker images
- Pushes to Google Artifact Registry
- Deploys to Cloud Run
- Runs smoke tests

Required GitHub Secrets

Secret	Description
`GCP_PROJECT_ID`	Google Cloud project ID
`GCP_WORKLOAD_IDENTITY_PROVIDER`	Workload Identity Federation provider
`GCP_SERVICE_ACCOUNT`	Service account email for deployment

Manual Deployment

# Build and push API image
# (tag with the full registry path so that docker push can find the image)
docker build -f infrastructure/Dockerfile -t gcr.io/YOUR_PROJECT/stress-api .
docker push gcr.io/YOUR_PROJECT/stress-api

# Build and push Streamlit image
docker build -f infrastructure/Dockerfile.streamlit -t gcr.io/YOUR_PROJECT/stress-streamlit .
docker push gcr.io/YOUR_PROJECT/stress-streamlit

# Deploy to Cloud Run
gcloud run deploy stress-api --image gcr.io/YOUR_PROJECT/stress-api --region us-central1
gcloud run deploy stress-streamlit --image gcr.io/YOUR_PROJECT/stress-streamlit --region us-central1

🧪 Testing

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html

# Run specific test file
pytest tests/unit/test_api.py

# Run with verbose output
pytest -v

Test Structure

tests/
├── conftest.py           # Shared fixtures
├── unit/
│   ├── test_api.py       # API endpoint tests
│   ├── test_predictor.py # Model prediction tests
│   └── test_preprocessor.py # Data preprocessing tests
└── integration/
    └── test_pipeline.py  # End-to-end tests

Coverage

The project maintains 80%+ test coverage across all modules.

📁 Project Structure

developer-stress-predictor-ml-production/
│
├── 📂 .github/
│   └── workflows/
│       ├── ci.yml              # Continuous Integration
│       └── cd.yml              # Continuous Deployment
│
├── 📂 src/
│   ├── __init__.py
│   ├── config.py               # Application configuration
│   ├── 📂 api/
│   │   ├── __init__.py
│   │   ├── main.py             # FastAPI application
│   │   ├── schemas.py          # Pydantic models
│   │   └── dependencies.py     # Dependency injection
│   ├── 📂 data/
│   │   ├── __init__.py
│   │   └── preprocessor.py     # Data transformations
│   ├── 📂 models/
│   │   ├── __init__.py
│   │   ├── trainer.py          # Model training
│   │   └── predictor.py        # Model inference
│   └── 📂 monitoring/
│       ├── __init__.py
│       └── metrics.py          # Prometheus metrics
│
├── 📂 streamlit_app/
│   ├── app.py                  # Main Streamlit app
│   ├── utils.py                # Utility functions
│   └── 📂 components/
│       ├── __init__.py
│       ├── prediction_form.py  # Input form component
│       ├── results_display.py  # Results visualization
│       └── monitoring_dashboard.py  # Monitoring UI
│
├── 📂 tests/
│   ├── conftest.py             # Test fixtures
│   ├── 📂 unit/
│   └── 📂 integration/
│
├── 📂 infrastructure/
│   ├── Dockerfile              # API container
│   ├── Dockerfile.streamlit    # Streamlit container
│   └── docker-compose.yml      # Local development
│
├── 📂 models/
│   └── stress_model.joblib     # Trained model
│
├── 📂 data/
│   └── developer_stress.csv    # Training data
│
├── 📂 scripts/
│   └── train_model.py          # Training script
│
├── 📂 notebooks/
│   └── developer_stress.ipynb  # Exploratory analysis
│
├── .env.example                # Environment template
├── .gitignore
├── pyproject.toml              # Project configuration
├── requirements.txt            # Dependencies
└── README.md

🛠️ Tech Stack

Machine Learning

Technology	Purpose
scikit-learn	Model training & inference
	Data manipulation
	Numerical computing
joblib	Model serialization

Backend

Technology	Purpose
FastAPI	REST API framework
Pydantic	Data validation
Uvicorn	ASGI server

Frontend

Technology	Purpose
Streamlit	Web application
Plotly	Interactive charts

Infrastructure

Technology	Purpose
Docker	Containerization
Google Cloud Run	Serverless deployment
GitHub Actions	CI/CD pipelines

Quality Assurance

Technology	Purpose
Pytest	Testing framework
Ruff	Linting
MyPy	Type checking
Bandit	Security scanning

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with ☕ and 🧠 by Acquarts
