Releases: mara-werils/llmstack

v1.0.0 — The Open-Source LLM Platform

09 May 11:04
dbcfbc0

llmstack v1.0.0

The open-source LLM platform that actually saves you money. Four major features ship in this release:

Universal LLM Gateway

  • Route requests across 6 cloud providers (OpenAI, Anthropic, Google, Groq, Together, Mistral) + local Ollama/vLLM through a single OpenAI-compatible endpoint
  • Cost-aware routing — automatically picks the cheapest model for each query tier
  • Fallback chains — automatic failover between providers on errors
  • Per-provider cost tracking with `X-Cost-USD` response headers
  • Format translation: Anthropic Messages API, Google Gemini → OpenAI format
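
As a rough illustration of how cost-aware routing and fallback chains can fit together, the sketch below tries providers from cheapest to most expensive and fails over on any error. It is not llmstack's implementation: the `Provider` class, the `route` function, the provider names, and the flat per-1k-token prices are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Provider:
    name: str
    usd_per_1k_tokens: float      # hypothetical flat price, for illustration only
    call: Callable[[str], str]    # takes a prompt, returns a reply or raises


def route(prompt: str, providers: list[Provider]) -> tuple[str, str]:
    """Try providers cheapest-first; return (provider_name, reply)."""
    errors = []
    for p in sorted(providers, key=lambda p: p.usd_per_1k_tokens):
        try:
            return p.name, p.call(prompt)
        except Exception as exc:  # any provider error triggers failover
            errors.append(f"{p.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in backends so the sketch runs without network access.
def flaky(prompt: str) -> str:
    raise TimeoutError("upstream timed out")


def stable(prompt: str) -> str:
    return f"echo: {prompt}"


providers = [
    Provider("pricey-but-stable", 0.50, stable),
    Provider("cheap-but-flaky", 0.05, flaky),
]
print(route("hello", providers))  # ('pricey-but-stable', 'echo: hello')
```

The cheap provider is attempted first and times out, so the request falls through to the next entry in the chain. A real router would also need per-tier model selection and retry limits, which this sketch leaves out.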

AI Agents & MCP Server

  • ReAct agent loop with 6 built-in tools: `read_file`, `write_file`, `list_directory`, `grep`, `shell`, `http_get`
  • `llmstack agent "task"` — complete tasks autonomously with tool use
  • MCP server (`llmstack mcp`) — expose tools and LLM inference to Claude Code, Cursor, VS Code
  • 8 MCP tools including `llmstack_chat` and `llmstack_ask` (file RAG)
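
For readers new to the ReAct pattern, here is a minimal sketch of the act/observe loop, assuming a model that returns either a tool call or a final answer as a dict. The `react_loop` function, the step format, the scripted model, and the in-memory `read_file` stand-in are hypothetical and only illustrate the control flow, not llmstack's agent code.

```python
from typing import Callable

Tool = Callable[[str], str]


def react_loop(task: str, model: Callable[[list], dict],
               tools: dict[str, Tool], max_steps: int = 5) -> str:
    """Alternate model steps and tool calls until the model gives a final answer."""
    transcript = [f"Task: {task}"]
    for _ in range(max_steps):
        step = model(transcript)  # {"action": ..., "input": ...} or {"final": ...}
        if "final" in step:
            return step["final"]
        tool = tools.get(step["action"])
        if tool is None:
            observation = f"error: unknown tool {step['action']!r}"
        else:
            observation = tool(step["input"])
        transcript.append(f"Action: {step['action']}({step['input']})")
        transcript.append(f"Observation: {observation}")
    return "stopped: step limit reached"


# Stand-ins: an in-memory file store and a scripted "model" instead of a real LLM.
fake_files = {"notes.txt": "ship v1.0.0"}
tools = {"read_file": lambda path: fake_files.get(path, "error: no such file")}


def scripted_model(transcript: list) -> dict:
    if not any(line.startswith("Observation:") for line in transcript):
        return {"action": "read_file", "input": "notes.txt"}
    last = transcript[-1].removeprefix("Observation: ")
    return {"final": f"The file says: {last}"}


print(react_loop("What is in notes.txt?", scripted_model, tools))
# The file says: ship v1.0.0
```

The step limit matters in practice: without it, a model that never emits a final answer would loop forever.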

One-Command Fine-tuning

  • `llmstack finetune data.jsonl --export-ollama my-model`
  • Auto-detect format (CSV, JSON, JSONL, TXT, Parquet) and columns
  • Auto hyperparameters based on dataset size and model
  • Dual backend: Unsloth (2x faster) or Hugging Face PEFT/TRL
  • GGUF export + Ollama model creation
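
One plausible way to auto-detect the dataset format and columns is to map the file extension to a format and then look for common prompt/response field names. The sketch below makes that assumption explicit; `detect_format`, `guess_columns`, and the candidate column names are hypothetical and may differ from what llmstack actually checks.

```python
import json
from pathlib import Path

# Extensions taken from the formats listed above.
SUPPORTED = {".csv": "csv", ".json": "json", ".jsonl": "jsonl",
             ".txt": "txt", ".parquet": "parquet"}

# Hypothetical candidate names, checked in priority order.
PROMPT_NAMES = ("prompt", "instruction", "question", "input")
RESPONSE_NAMES = ("response", "completion", "answer", "output")


def detect_format(path: str) -> str:
    """Map a file extension to a dataset format name."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED:
        raise ValueError(f"unsupported dataset format: {suffix or path!r}")
    return SUPPORTED[suffix]


def guess_columns(record: dict) -> tuple[str, str]:
    """Pick the prompt and response keys from one record, case-insensitively."""
    keys = {k.lower(): k for k in record}
    prompt = next((keys[n] for n in PROMPT_NAMES if n in keys), None)
    response = next((keys[n] for n in RESPONSE_NAMES if n in keys), None)
    if prompt is None or response is None:
        raise ValueError(f"could not infer columns from {sorted(record)}")
    return prompt, response


line = '{"Instruction": "Say hi", "output": "hi"}'  # one JSONL row
print(detect_format("data.jsonl"), guess_columns(json.loads(line)))
# jsonl ('Instruction', 'output')
```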

AI-Native Observability

  • Quality scoring on every response (coherence, relevance, refusal, toxicity, repetition)
  • Drift detection — automatic alerts when quality degrades
  • A/B testing — compare models with statistical confidence
  • Request tracing — full lifecycle traces with quality scores
  • `llmstack eval` CLI for dataset evaluation and live gateway monitoring
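
To make drift detection concrete, the sketch below raises an alert when the rolling mean of recent quality scores falls more than a tolerance below a baseline. The `DriftDetector` class, window size, and thresholds are hypothetical; llmstack may use a different statistic.

```python
from collections import deque
from statistics import mean


class DriftDetector:
    """Flag drift when the rolling mean drops below baseline - tolerance."""

    def __init__(self, baseline: float, window: int = 5, tolerance: float = 0.1):
        self.baseline = baseline
        self.tolerance = tolerance
        self.scores = deque(maxlen=window)  # keeps only the latest `window` scores

    def observe(self, score: float) -> bool:
        """Record one score; return True if the window is full and has drifted."""
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough data yet
        return mean(self.scores) < self.baseline - self.tolerance


detector = DriftDetector(baseline=0.9, window=3, tolerance=0.1)
alerts = [detector.observe(s) for s in (0.90, 0.88, 0.91, 0.60, 0.55)]
print(alerts)  # [False, False, False, True, True]
```

A rolling window smooths out single bad responses, so one low score alone does not trigger an alert unless it pulls the window mean under the threshold.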

Stats

  • +7,860 lines of new code
  • 448 tests, 0 failures
  • 14 CLI commands
  • 12 API endpoints

Install

```bash
pip install llmstack-cli
```

Full changelog

New commands: `agent`, `mcp`, `finetune`, `eval`

New modules: `gateway/providers/` (7 adapters), `agent/` (tools + loop), `mcp/` (JSON-RPC server), `finetune/` (data + training + eval + export), `observe/` (scoring + traces + tracker + A/B)

Config: Added `providers`, `agents`, `mcp`, `finetune` sections to `llmstack.yaml`. Extended `observe` with quality tracking settings.

Dependencies: Added optional `[finetune]` extra for PyTorch/PEFT/TRL.

v0.2.0 — Interactive Chat + Docker Compose Export

07 May 14:54
e992f28

What's New

Interactive Terminal Chat

```bash
llmstack chat
```

Stream responses from your local LLM directly in the terminal. Supports conversation history, `/clear` to reset, and Ctrl+C to quit.

Docker Compose Export

```bash
llmstack export
# Exported 7 services to docker-compose.yml
# Run with: docker compose up -d
```

Generate a standalone `docker-compose.yml` from your `llmstack.yaml`. Share it with your team; running it does not require llmstack.

Bug Fixes

  • Gateway Docker image now builds locally (no longer requires ghcr.io)
  • Prometheus and Grafana configs are written to disk before container start
  • Generated API keys persist to llmstack.yaml across restarts
  • Clear error messages for port conflicts

Stats

  • 50 tests, 0 lint errors
  • 8 CLI commands: `init`, `up`, `down`, `status`, `chat`, `export`, `logs`, `doctor`

Install

```bash
pip install llmstack-cli
```

Full docs: https://github.com/mara-werils/llmstack

v0.1.0 — Initial Release

07 May 14:11

llmstack v0.1.0 — Initial Release

One command. Full LLM stack. Zero config.

What's included

  • CLI with `init`, `up`, `down`, `status`, `logs`, `doctor` commands
  • Auto hardware detection — NVIDIA, Apple Silicon, CPU
  • Smart backend resolver — auto-picks Ollama or vLLM based on your GPU
  • Services: Ollama, vLLM, Qdrant, Redis, TEI (Text Embeddings Inference)
  • API Gateway: OpenAI-compatible proxy with auth, rate limiting, SSE streaming
  • Observability: Prometheus + Grafana with pre-provisioned dashboard
  • Plugin system via Python entry_points
  • Presets: chat, rag, agent
  • Pydantic v2 config schema (llmstack.yaml)
  • Docker SDK orchestration (no docker-compose dependency)
  • CI/CD: GitHub Actions for lint/test and PyPI release

Install

```bash
pip install llmstack-cli
```

Quick start

```bash
llmstack init
llmstack up
```

Full docs: https://github.com/mara-werils/llmstack#readme