LLM Operations Development Stack

A secure, localhost-only infrastructure for LLM development and experimentation, featuring automated setup and management of MLflow for experiment tracking and Forgejo for version control and CI/CD. It also includes shell integration with an enhanced prompt that shows the git branch, Python environment, and job status.

🎯 Purpose

This repository provides scripts and configuration for managing a local LLM development environment with:

MLflow: Experiment tracking, model versioning, and artifact management
Forgejo: Self-hosted Git service with CI/CD capabilities
Private & Secure: All services run locally with no external dependencies
Security First: Localhost-only access, secure configurations, and proper file permissions

🏗️ Architecture

The stack creates a self-contained development environment:

┌────────────────────────────────────────────────────────┐
│                LLM DevStack                            │
├────────────────────────────────────────────────────────┤
│  Forgejo (localhost:3000)     MLflow (localhost:5000)  │
│  ├─ Git repositories          ├─ Experiment tracking   │
│  ├─ CI/CD pipelines           ├─ Model registry        │
│  └─ Issue tracking            └─ Artifact storage      │
├────────────────────────────────────────────────────────┤
│              Local File System                         │
│  ├─ SQLite databases                                   │
│  ├─ Git repositories                                   │
│  ├─ ML artifacts                                       │
│  └─ Configuration files                                │
└────────────────────────────────────────────────────────┘

🚀 Quick Start

Prerequisites

macOS or Linux
Python 3.7+ for MLflow virtual environment
curl for downloading Forgejo binary
openssl for generating security keys
Homebrew (macOS only) - for Forgejo installation

Installation

Clone the repository:

git clone <repository-url>
cd llmops_devstack

Configure installation paths:

cp config.env.example config.env
# Edit config.env with your preferred installation directories (optional)

Run the setup script:
```
./scripts/setup.sh
```
Configure shell (optional):
```
./scripts/configure_shell.sh
source ~/.bashrc   # or ~/.zshrc if you use zsh
```
This adds convenient aliases and an enhanced prompt; see Shell Integration for details.
Start the services:
```
./scripts/start_services.sh
```
Access the services:
- Forgejo: http://localhost:3000 (or displayed port)
- MLflow: http://localhost:5000 (or displayed port)

📋 Configuration

Default Configuration

The system uses sensible defaults that work out-of-the-box:

Installation Directory: $HOME/llmops_services
Network Binding: 127.0.0.1 (localhost only)
Default Ports: 3000 (Forgejo), 5000 (MLflow)
Auto Port Detection: Finds available ports if defaults are busy
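
The auto port detection listed above could work roughly as follows. This is a minimal sketch, not the logic used by the repository's scripts: the helper name `find_free_port` and the choice to probe consecutive ports upward from the default are assumptions made for illustration.

```python
import socket


def find_free_port(start_port, host="127.0.0.1", max_tries=50):
    """Return the first port >= start_port that can be bound on host.

    Hypothetical helper: the real scripts may probe ports differently.
    """
    last_port = min(start_port + max_tries, 65536)
    for port in range(start_port, last_port):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is busy, try the next one
            return port
    raise RuntimeError(f"no free port found starting at {start_port}")


if __name__ == "__main__":
    print("Forgejo port:", find_free_port(3000))
    print("MLflow port:", find_free_port(5000))
```

Binding to 127.0.0.1 rather than all interfaces keeps the probe consistent with the localhost-only network binding.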

Customization

Edit config.env to customize:

# Base installation directory
BASE_DIR="$HOME/llmops_services"

# Network security
FORGEJO_SERVER_HOST="127.0.0.1"    # Localhost only
MLFLOW_SERVER_HOST="127.0.0.1"     # Localhost only

# Service timeouts
GRACEFUL_SHUTDOWN_TIMEOUT=10        # Seconds
STATUS_CHECK_TIMEOUT=2              # Seconds

# Default ports (auto-detected if busy)
DEFAULT_FORGEJO_PORT=3000
DEFAULT_MLFLOW_PORT=5000
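
If other tooling needs these settings, a small parser is usually enough. The sketch below is an illustration under stated assumptions, not part of the repository: `load_config` is a hypothetical helper that handles only simple `KEY=value` lines with optional double quotes and trailing comments, and it expands `$HOME`-style variables from the current environment.

```python
import os
import re

# Matches simple assignments such as KEY="value"  # comment
_LINE = re.compile(r'^\s*([A-Za-z_][A-Za-z0-9_]*)=("([^"]*)"|([^\s#]*))')


def load_config(text, defaults=None):
    """Parse config.env-style text into a dict of strings (hypothetical helper)."""
    config = dict(defaults or {})
    for line in text.splitlines():
        match = _LINE.match(line)
        if not match:
            continue  # blank line, comment, or unsupported syntax
        key = match.group(1)
        quoted, bare = match.group(3), match.group(4)
        value = quoted if quoted is not None else bare
        config[key] = os.path.expandvars(value)
    return config


if __name__ == "__main__":
    sample = 'BASE_DIR="$HOME/llmops_services"\nDEFAULT_MLFLOW_PORT=5000\n'
    print(load_config(sample, defaults={"DEFAULT_FORGEJO_PORT": "3000"}))
```

All values come back as strings, so callers would convert ports and timeouts with `int()` where needed.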

🔧 Usage

Service Management

# Start all services
./scripts/start_services.sh

# Check service status
./scripts/status_services.sh

# Stop all services
./scripts/stop_services.sh

Forgejo Setup

Initial Setup (first time only):

Start services: ./scripts/start_services.sh
Visit Forgejo web interface (URL will be displayed)
Complete the installation wizard:
- Database Type: Select SQLite3 (file-based, no server required)
- Leave other database settings as default
- Scroll down to the bottom of the configuration page
- Administrator Account: Fill out username and password for your admin user
- Click "Install Forgejo" to complete setup
Registration is disabled by default for security

Environment Activation

# Activate the development environment
# (BASE_DIR defaults to $HOME/llmops_services; set it first if your shell does not define it)
source "$BASE_DIR/activate.sh"

# Now you can use MLflow CLI directly
mlflow --help

📁 Directory Structure

$HOME/llmops_services/
├── forgejo/
│   ├── bin/forgejo                 # Forgejo binary
│   ├── data/gitea/                 # Database and repositories
│   └── logs/                       # Application logs
├── mlflow/
│   ├── tracking/                   # MLflow tracking database
│   ├── artifacts/                  # Model artifacts
│   ├── logs/                       # Application logs
│   └── mlflow.db                   # SQLite database
├── venv/                           # Python virtual environment
├── logs/                           # Service startup logs
└── activate.sh                     # Environment activation script

# Runtime files in project root:
├── .forgejo.pid                    # Process ID (for shutdown)
├── .forgejo.port                   # Port number (for access)
├── .mlflow.pid                     # Process ID (for shutdown)
└── .mlflow.port                    # Port number (for access)
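
Because the port in use can differ from the default, tools that talk to the services can read it from these runtime files. The sketch below assumes each `.port` file contains only the port number as text; that file format, the helper name `service_url`, and the fallback to the default port are assumptions, not documented behaviour.

```python
from pathlib import Path


def service_url(project_root, name, default_port):
    """Build http://localhost:<port> from .<name>.port (hypothetical helper)."""
    port_file = Path(project_root) / f".{name}.port"
    try:
        port = int(port_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        port = default_port  # file missing or not a number
    return f"http://localhost:{port}"


if __name__ == "__main__":
    print(service_url(".", "mlflow", 5000))
    print(service_url(".", "forgejo", 3000))
```

The MLflow URL returned here could be passed to `mlflow.set_tracking_uri` instead of a hard-coded port.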

🛠️ Advanced Usage

Shell Integration

Automated Setup (Recommended):

# Configure shell with aliases and enhanced prompt
./scripts/configure_shell.sh

# Apply changes
source ~/.bashrc

Manual Setup:

# Add aliases to ~/.bashrc or ~/.zshrc
alias llmops-start="/path/to/llmops_devstack/scripts/start_services.sh && source ~/llmops_services/.mlflow_env"
alias llmops-stop="/path/to/llmops_devstack/scripts/stop_services.sh"
alias llmops-status="/path/to/llmops_devstack/scripts/status_services.sh"

The configure_shell.sh script automatically:

Detects your shell (bash/zsh)
Creates backup of existing configuration
Adds LLM DevStack aliases with absolute paths
Configures enhanced prompt with git branch, Python environment, and SLURM job info
Safely updates existing configuration if run again

MLflow Integration

import mlflow

# Point the client at the local tracking server
# (use the port reported at startup if 5000 was busy)
mlflow.set_tracking_uri("http://localhost:5000")

# Track experiments
with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("accuracy", 0.95)
    mlflow.log_artifact("model.pkl")  # model.pkl must already exist on disk

Forgejo Integration

# Clone repositories
git clone http://localhost:3000/username/repository.git

# Set up CI/CD in .forgejo/workflows/
# Push changes to trigger builds

🔍 Troubleshooting

Common Issues

Services won't start:

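# Check whether the ports are already in use
# (macOS netstat prints the port after a dot, Linux after a colon)
netstat -an | grep -E '[.:]3000[[:space:]]'
netstat -an | grep -E '[.:]5000[[:space:]]'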

# Check logs
tail -f $BASE_DIR/logs/forgejo.log
tail -f $BASE_DIR/logs/mlflow.log

Permission errors:

# Fix file permissions
chmod 700 $BASE_DIR/forgejo/data
chmod 700 $BASE_DIR/mlflow
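
Before changing anything, it can help to see which directories actually deviate from mode 700. The following is a small sketch for that check; the helper name `is_private_dir` and the fallback to `~/llmops_services` when `BASE_DIR` is unset are assumptions for illustration.

```python
import os
import stat


def is_private_dir(path):
    """Return True if path is a directory with mode 700 (owner-only access)."""
    info = os.stat(path)
    return stat.S_ISDIR(info.st_mode) and stat.S_IMODE(info.st_mode) == 0o700


if __name__ == "__main__":
    base_dir = os.environ.get("BASE_DIR", os.path.expanduser("~/llmops_services"))
    for sub in ("forgejo/data", "mlflow"):
        target = os.path.join(base_dir, sub)
        if os.path.isdir(target):
            print(target, "ok" if is_private_dir(target) else "needs chmod 700")
        else:
            print(target, "missing")
```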

Can't access services:

# Verify services are running
./scripts/status_services.sh

# Check network binding
ps aux | grep forgejo
ps aux | grep mlflow
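
A process appearing in `ps` does not guarantee that it accepts connections. As a complementary check, the sketch below attempts a TCP connection to each service on 127.0.0.1. The helper name `is_listening` is hypothetical, and the 2-second timeout simply mirrors the `STATUS_CHECK_TIMEOUT` default; the status script itself may check differently.

```python
import socket


def is_listening(port, host="127.0.0.1", timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, timed out, or otherwise unreachable


if __name__ == "__main__":
    for name, port in (("Forgejo", 3000), ("MLflow", 5000)):
        state = "reachable" if is_listening(port) else "not reachable"
        print(f"{name} on 127.0.0.1:{port}: {state}")
```

If a different port was auto-selected at startup, pass that port instead of the default.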

Log Locations

Service startup logs: $BASE_DIR/logs/
Forgejo logs: $BASE_DIR/forgejo/logs/
MLflow logs: $BASE_DIR/mlflow/logs/

📄 License

MIT License - see LICENSE file for details.

Data Backup

Create Backup:

# Backup all data with timestamped archive
./scripts/backup_data.sh

# Backups are stored in $BASE_DIR/backups/YYYYMMDD_HHMMSS/
# Includes manifest file with detailed inventory

Restore from Backup:

# Stop services first
./scripts/stop_services.sh

# Copy data back from the backup directory
# (replace 20240101_120000 with the timestamp of the backup you want to restore)
cp -r $BASE_DIR/backups/20240101_120000/mlflow/* $BASE_DIR/mlflow/
cp -r $BASE_DIR/backups/20240101_120000/forgejo/* $BASE_DIR/forgejo/data/

# Restart services
./scripts/start_services.sh
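
When several backups exist, the most recent one can be picked out by parsing the `YYYYMMDD_HHMMSS` directory names. This is a minimal sketch under the assumption that every backup lives in its own directory named exactly in that format; `latest_backup` is a hypothetical helper, not a script shipped with the repository.

```python
from datetime import datetime
from pathlib import Path

STAMP_FORMAT = "%Y%m%d_%H%M%S"  # matches the YYYYMMDD_HHMMSS directory names


def latest_backup(backups_dir):
    """Return the newest timestamped backup directory, or None if there is none."""
    candidates = []
    for entry in Path(backups_dir).iterdir():
        if not entry.is_dir():
            continue
        try:
            stamp = datetime.strptime(entry.name, STAMP_FORMAT)
        except ValueError:
            continue  # not a timestamped backup directory
        candidates.append((stamp, entry))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```

The returned path can then stand in for the example timestamp in the restore commands.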

Process Management

Clean Up Orphaned Processes:

# Interactive cleanup of MLflow processes
./scripts/cleanup_mlflow.sh

# Useful if services weren't stopped properly
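
A leftover PID file is one sign that a service was not stopped cleanly. The sketch below checks whether the process named in a PID file still exists. It assumes the `.pid` files contain just the process ID as text, and both helper names are hypothetical; it only reports and does not terminate anything.

```python
import os
from pathlib import Path


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def stale_pid_file(pid_file):
    """Return True if pid_file exists but does not point at a live process."""
    path = Path(pid_file)
    if not path.exists():
        return False
    try:
        pid = int(path.read_text().strip())
    except ValueError:
        return True  # unreadable contents count as stale
    if pid <= 0:
        return True  # not a valid process ID
    return not pid_is_running(pid)


if __name__ == "__main__":
    for name in (".forgejo.pid", ".mlflow.pid"):
        print(name, "stale" if stale_pid_file(name) else "ok or absent")
```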

📚 Examples

The examples/ directory contains sample scripts demonstrating integration with HPC environments and common ML workflows:

gpu_check.sh - SLURM job script for GPU availability checking with MLflow logging
See examples/README.md for detailed usage instructions

🔧 Script Reference

| Script | Purpose | Usage |
|--------|---------|-------|
| `setup.sh` | Initial installation and configuration | `./scripts/setup.sh` |
| `start_services.sh` | Start MLflow and Forgejo services | `./scripts/start_services.sh` |
| `stop_services.sh` | Gracefully stop all services | `./scripts/stop_services.sh` |
| `status_services.sh` | Check service status and URLs | `./scripts/status_services.sh` |
| `configure_shell.sh` | Configure shell aliases and enhanced prompt | `./scripts/configure_shell.sh` |
| `backup_data.sh` | Create timestamped backup of all data | `./scripts/backup_data.sh` |
| `cleanup_mlflow.sh` | Clean up orphaned MLflow processes | `./scripts/cleanup_mlflow.sh` |
