A secure, localhost-only infrastructure for LLM development and experimentation, featuring automated setup and management of MLflow for experiment tracking and Forgejo for version control and CI/CD. Includes convenient shell integration with enhanced prompts showing git branches, Python environments, and job status.
This repository provides scripts and configuration for managing a local LLM development environment with:
- MLflow: Experiment tracking, model versioning, and artifact management
- Forgejo: Self-hosted Git service with CI/CD capabilities
- Private & Secure: All services run locally with no external dependencies
- Security First: Localhost-only access, secure configurations, and proper file permissions
The stack creates a self-contained development environment:
┌────────────────────────────────────────────────────────┐
│                      LLM DevStack                      │
├────────────────────────────────────────────────────────┤
│  Forgejo (localhost:3000)    MLflow (localhost:5000)   │
│  ├─ Git repositories         ├─ Experiment tracking    │
│  ├─ CI/CD pipelines          ├─ Model registry         │
│  └─ Issue tracking           └─ Artifact storage       │
├────────────────────────────────────────────────────────┤
│                   Local File System                    │
│  ├─ SQLite databases                                   │
│  ├─ Git repositories                                   │
│  ├─ ML artifacts                                       │
│  └─ Configuration files                                │
└────────────────────────────────────────────────────────┘
- macOS or Linux
- Python 3.7+ for MLflow virtual environment
- curl for downloading Forgejo binary
- openssl for generating security keys
- Homebrew (macOS only) - for Forgejo installation
- Clone the repository:
  git clone <repository-url>
  cd llmops_devstack
- Configure installation paths:
  cp config.env.example config.env
  # Edit config.env with your preferred installation directories (optional)
- Run the setup script:
  ./scripts/setup.sh
- Configure shell (optional):
  ./scripts/configure_shell.sh
  source ~/.bashrc
  Adds convenient aliases and an enhanced prompt; see Shell Integration for details.
- Start the services:
  ./scripts/start_services.sh
- Access the services:
  - Forgejo: http://localhost:3000 (or displayed port)
  - MLflow: http://localhost:5000 (or displayed port)
The system uses sensible defaults that work out-of-the-box:
- Installation Directory: $HOME/llmops_services
- Network Binding: 127.0.0.1 (localhost only)
- Default Ports: 3000 (Forgejo), 5000 (MLflow)
- Auto Port Detection: Finds available ports if defaults are busy
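The auto port detection described above can be illustrated with a short sketch. This is not the repository's implementation, only one common way to find a free port: try to bind each candidate on the loopback address and take the first that succeeds. The function name `find_available_port` and the `max_tries` limit are assumptions made for this example.

```python
import socket


def find_available_port(start_port: int, host: str = "127.0.0.1", max_tries: int = 50) -> int:
    """Return the first port >= start_port that can be bound on host.

    Hypothetical helper: the real scripts may detect ports differently.
    """
    last_port = min(start_port + max_tries, 65536)
    for port in range(start_port, last_port):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is busy (or not permitted), try the next one
            return port
    raise RuntimeError(f"no free port found in range {start_port}-{last_port - 1}")


if __name__ == "__main__":
    # Start from the documented defaults and fall back to the next free port.
    print("Forgejo port:", find_available_port(3000))
    print("MLflow port:", find_available_port(5000))
```

Note that a port reported as free can still be taken by another process before the service binds it, so a real startup script would typically also handle a bind failure at launch time.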
Edit config.env to customize:
# Base installation directory
BASE_DIR="$HOME/llmops_services"
# Network security
FORGEJO_SERVER_HOST="127.0.0.1" # Localhost only
MLFLOW_SERVER_HOST="127.0.0.1" # Localhost only
# Service timeouts
GRACEFUL_SHUTDOWN_TIMEOUT=10 # Seconds
STATUS_CHECK_TIMEOUT=2 # Seconds
# Default ports (auto-detected if busy)
DEFAULT_FORGEJO_PORT=3000
DEFAULT_MLFLOW_PORT=5000
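If you want to read these settings from your own Python tooling, the sketch below parses the `KEY=VALUE` format shown above. It assumes config.env contains only simple assignments with optional quotes and trailing `#` comments, and it is not a full shell parser. `load_config` is a hypothetical helper, not part of the repository.

```python
import os
import shlex


def load_config(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    config = {}
    for raw_line in text.splitlines():
        # shlex removes surrounding quotes and drops trailing "# ..." comments.
        tokens = shlex.split(raw_line, comments=True)
        if not tokens or "=" not in tokens[0]:
            continue
        key, value = tokens[0].split("=", 1)
        # Expand $HOME and other environment variables in the value.
        config[key] = os.path.expandvars(value)
    return config


if __name__ == "__main__":
    sample = 'BASE_DIR="$HOME/llmops_services"\nDEFAULT_MLFLOW_PORT=5000\n'
    for key, value in load_config(sample).items():
        print(f"{key} = {value}")
```

All values come back as strings, so numeric settings such as the ports and timeouts need an explicit `int(...)` conversion.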
# Start all services
./scripts/start_services.sh
# Check service status
./scripts/status_services.sh
# Stop all services
./scripts/stop_services.sh
Initial Setup (first time only):
- Start services: ./scripts/start_services.sh
- Visit the Forgejo web interface (URL will be displayed)
- Complete the installation wizard:
  - Database Type: Select SQLite3 (file-based, no server required)
  - Leave other database settings as default
  - Scroll down to the bottom of the configuration page
  - Administrator Account: Fill out username and password for your admin user
  - Click "Install Forgejo" to complete setup
- Registration is disabled by default for security
# Activate the development environment
# (BASE_DIR is the installation directory from config.env, $HOME/llmops_services by default)
source "$BASE_DIR/activate.sh"
# Now you can use MLflow CLI directly
mlflow --help
$HOME/llmops_services/
├── forgejo/
│   ├── bin/forgejo      # Forgejo binary
│   ├── data/gitea/      # Database and repositories
│   └── logs/            # Application logs
├── mlflow/
│   ├── tracking/        # MLflow tracking database
│   ├── artifacts/       # Model artifacts
│   ├── logs/            # Application logs
│   └── mlflow.db        # SQLite database
├── venv/                # Python virtual environment
├── logs/                # Service startup logs
└── activate.sh          # Environment activation script

# Runtime files in project root:
├── .forgejo.pid         # Process ID (for shutdown)
├── .forgejo.port        # Port number (for access)
├── .mlflow.pid          # Process ID (for shutdown)
└── .mlflow.port         # Port number (for access)
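The runtime files listed above can also be inspected from Python. The sketch below assumes each `.pid` and `.port` file holds a single integer, which is an assumption about the file format rather than something the scripts guarantee. The helper names are hypothetical.

```python
import os
from pathlib import Path


def read_int_file(path: Path):
    """Return the integer stored in path, or None if missing or malformed."""
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def process_is_running(pid: int) -> bool:
    """Check whether a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def service_state(project_root: Path, name: str) -> dict:
    """Summarize the recorded PID and port for a service such as 'forgejo' or 'mlflow'."""
    pid = read_int_file(project_root / f".{name}.pid")
    port = read_int_file(project_root / f".{name}.port")
    running = pid is not None and process_is_running(pid)
    return {"pid": pid, "port": port, "running": running}


if __name__ == "__main__":
    root = Path.cwd()  # run from the project root
    for service in ("forgejo", "mlflow"):
        print(service, service_state(root, service))
```

A stale `.pid` file left behind by a crash shows up here as a recorded PID with `running` set to `False`.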
Automated Setup (Recommended):
# Configure shell with aliases and enhanced prompt
./scripts/configure_shell.sh
# Apply changes (source ~/.zshrc instead if your shell is zsh)
source ~/.bashrc
Manual Setup:
# Add aliases to ~/.bashrc or ~/.zshrc
alias llmops-start="/path/to/llmops_devstack/scripts/start_services.sh && source ~/llmops_services/.mlflow_env"
alias llmops-stop="/path/to/llmops_devstack/scripts/stop_services.sh"
alias llmops-status="/path/to/llmops_devstack/scripts/status_services.sh"
The configure_shell.sh script automatically:
- Detects your shell (bash/zsh)
- Creates a backup of the existing configuration
- Adds LLM DevStack aliases with absolute paths
- Configures enhanced prompt with git branch, Python environment, and SLURM job info
- Safely updates existing configuration if run again
import mlflow

# Point MLflow at the local tracking server (adjust the port if a different one was displayed)
mlflow.set_tracking_uri("http://localhost:5000")

# Track experiments
with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("accuracy", 0.95)
    mlflow.log_artifact("model.pkl")  # model.pkl must already exist on disk
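Because the MLflow port can differ from 5000 when auto port detection kicks in, a hardcoded URI may point at the wrong port. The sketch below builds the tracking URI from the `.mlflow.port` runtime file instead. It assumes that file contains only the port number. `tracking_uri` is a hypothetical helper.

```python
from pathlib import Path


def tracking_uri(project_root: Path, default_port: int = 5000) -> str:
    """Build the local MLflow tracking URI from the recorded port, if any."""
    port_file = project_root / ".mlflow.port"
    try:
        port = int(port_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        port = default_port  # fall back to the documented default
    return f"http://127.0.0.1:{port}"


if __name__ == "__main__":
    uri = tracking_uri(Path.cwd())  # run from the project root
    print(uri)
    # With MLflow installed, the result can be passed straight through:
    #   import mlflow
    #   mlflow.set_tracking_uri(uri)
```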
# Clone repositories
git clone http://localhost:3000/username/repository.git
# Set up CI/CD in .forgejo/workflows/
# Push changes to trigger builds
Services won't start:
# Check if ports are available
netstat -an | grep :3000
netstat -an | grep :5000
# Check logs
tail -f $BASE_DIR/logs/forgejo.log
tail -f $BASE_DIR/logs/mlflow.log
Permission errors:
# Fix file permissions
chmod 700 $BASE_DIR/forgejo/data
chmod 700 $BASE_DIR/mlflow
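To confirm that the permissions took effect, you can check that neither group nor others have any access bits set on the data directories. The following is a minimal sketch for POSIX systems. `is_owner_only` is a hypothetical helper, and the paths assume the default installation directory.

```python
import stat
from pathlib import Path


def is_owner_only(path: Path) -> bool:
    """Return True if group and others have no permission bits set (e.g. mode 700)."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return (mode & 0o077) == 0


if __name__ == "__main__":
    base_dir = Path.home() / "llmops_services"  # default BASE_DIR
    for directory in (base_dir / "forgejo" / "data", base_dir / "mlflow"):
        if not directory.exists():
            print(f"{directory}: not found")
        elif is_owner_only(directory):
            print(f"{directory}: OK (owner-only)")
        else:
            print(f"{directory}: accessible to group or others")
```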
Can't access services:
# Verify services are running
./scripts/status_services.sh
# Check network binding
ps aux | grep forgejo
ps aux | grep mlflow
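A running process does not necessarily mean the port is accepting connections. As a complementary check, the sketch below attempts a TCP connection to the loopback address. The default timeout of 2 seconds mirrors `STATUS_CHECK_TIMEOUT` from config.env, but the function itself is a hypothetical helper and not part of the repository's scripts.

```python
import socket


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in (("Forgejo", 3000), ("MLflow", 5000)):
        status = "reachable" if is_listening(port) else "not reachable"
        print(f"{name} on 127.0.0.1:{port}: {status}")
```

If a service is reachable on 127.0.0.1 but not through your machine's external address, that matches the localhost-only binding described earlier.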
- Service startup logs: $BASE_DIR/logs/
- Forgejo logs: $BASE_DIR/forgejo/logs/
- MLflow logs: $BASE_DIR/mlflow/logs/
MIT License - see LICENSE file for details.
Create Backup:
# Backup all data with timestamped archive
./scripts/backup_data.sh
# Backups are stored in $BASE_DIR/backups/YYYYMMDD_HHMMSS/
# Includes manifest file with detailed inventory
Restore from Backup:
# Stop services first
./scripts/stop_services.sh
# Copy data back from backup directory
cp -r $BASE_DIR/backups/20240101_120000/mlflow/* $BASE_DIR/mlflow/
cp -r $BASE_DIR/backups/20240101_120000/forgejo/* $BASE_DIR/forgejo/data/
# Restart services
./scripts/start_services.sh
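The restore commands above use a specific timestamped directory. To pick the most recent backup automatically, the directory names can be parsed using the `YYYYMMDD_HHMMSS` pattern. This is a sketch under the assumption that every backup lives in its own directory named exactly in that format. `latest_backup` is a hypothetical helper.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

BACKUP_NAME_FORMAT = "%Y%m%d_%H%M%S"  # matches YYYYMMDD_HHMMSS


def latest_backup(backups_dir: Path) -> Optional[Path]:
    """Return the newest timestamped backup directory, or None if there is none."""
    candidates = []
    for entry in backups_dir.iterdir():
        if not entry.is_dir():
            continue
        try:
            stamp = datetime.strptime(entry.name, BACKUP_NAME_FORMAT)
        except ValueError:
            continue  # ignore directories that are not timestamped backups
        candidates.append((stamp, entry))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


if __name__ == "__main__":
    backups = Path.home() / "llmops_services" / "backups"  # default BASE_DIR
    if backups.is_dir():
        print("Latest backup:", latest_backup(backups))
    else:
        print("No backups directory found at", backups)
```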
Clean Up Orphaned Processes:
# Interactive cleanup of MLflow processes
./scripts/cleanup_mlflow.sh
# Useful if services weren't stopped properly
The examples/ directory contains sample scripts demonstrating integration with HPC environments and common ML workflows:
- gpu_check.sh: SLURM job script for GPU availability checking with MLflow logging
- See examples/README.md for detailed usage instructions
| Script | Purpose | Usage |
|---|---|---|
| setup.sh | Initial installation and configuration | ./scripts/setup.sh |
| start_services.sh | Start MLflow and Forgejo services | ./scripts/start_services.sh |
| stop_services.sh | Gracefully stop all services | ./scripts/stop_services.sh |
| status_services.sh | Check service status and URLs | ./scripts/status_services.sh |
| configure_shell.sh | Configure shell aliases and enhanced prompt | ./scripts/configure_shell.sh |
| backup_data.sh | Create timestamped backup of all data | ./scripts/backup_data.sh |
| cleanup_mlflow.sh | Clean up orphaned MLflow processes | ./scripts/cleanup_mlflow.sh |