A snap package for Lemonade Server - a lightweight, high-performance local AI inference server with an OpenAI-compatible API.
```bash
sudo snap install lemonade-server
```

- OpenAI-compatible API - Drop-in replacement for OpenAI API endpoints
- Multiple backends - Vulkan, ROCm (AMD GPUs), and CPU support
- Automatic model management - Downloads and caches models from Hugging Face
- Background service - Runs automatically on system startup
- Hardware acceleration - Optimized for AMD GPUs with ROCm support
| Backend | Hardware | Architecture |
|---|---|---|
| Vulkan | Any Vulkan-capable GPU | - |
| ROCm | AMD RX 7000 series (RDNA3) | gfx110X |
| ROCm | AMD RX 9000 series (RDNA4) | gfx120X |
| ROCm | AMD Strix Point APUs | gfx1150 |
| ROCm | AMD Strix Halo APUs | gfx1151 |
| CPU | Any x86_64 processor | - |
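
The ROCm rows of the table can be read as a simple lookup. The sketch below is illustrative only: `rocm_supported_arch` and `describe_arch` are hypothetical helpers, not part of the snap, and they assume the trailing `X` in `gfx110X` and `gfx120X` stands for any single character. How you obtain the architecture string for your GPU is left open.

```bash
# Hypothetical helper (not shipped with the snap): succeeds if the given
# architecture string appears in the ROCm rows of the table above.
# Assumption: the "X" in gfx110X / gfx120X matches any single character.
rocm_supported_arch() {
  case "$1" in
    gfx110?|gfx120?|gfx1150|gfx1151) return 0 ;;
    *) return 1 ;;
  esac
}

# Print whether an architecture string is listed for ROCm in the table.
describe_arch() {
  if rocm_supported_arch "$1"; then
    echo "$1: listed for ROCm"
  else
    echo "$1: not listed for ROCm"
  fi
}
```

For example, `describe_arch gfx1151` reports the Strix Halo architecture as listed, while an architecture that is absent from the table is reported as not listed.
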
After installation, the server starts automatically and listens on port 13305.
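
The daemon may take a moment to come up, so a script that talks to the API right after installation can poll until the port answers. A minimal sketch, assuming `curl` is installed: `wait_for_server` and the `LEMONADE_BASE_URL` variable are hypothetical names introduced here, and the `/models` path is the one used by the curl examples in this README.

```bash
# Hypothetical variable: base URL of the local API (port 13305 as stated above).
LEMONADE_BASE_URL="${LEMONADE_BASE_URL:-http://localhost:13305/api/v1}"

# Hypothetical helper: poll the models endpoint once per second.
# Usage: wait_for_server [ATTEMPTS]   (defaults to 30 attempts)
# Returns 0 as soon as a request succeeds, 1 if every attempt fails.
wait_for_server() {
  attempts=${1:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -s -f -o /dev/null --max-time 2 "$LEMONADE_BASE_URL/models"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

A script could then run `wait_for_server 10 && curl "$LEMONADE_BASE_URL/models"` so that the request is only sent once the server responds.
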
The snap automatically detects your GPU and selects the best backend:
- AMD GPUs - Uses ROCm for optimal performance
- Other GPUs - Uses Vulkan
Vulkan support is provided by the mesa-2404 content snap, which is installed
automatically. If it isn't connected, connect it manually:
```bash
sudo snap connect lemonade-server:gpu-2404 mesa-2404:gpu-2404
sudo snap restart lemonade-server.daemon
```

Check the service status:

```bash
sudo snap services lemonade-server
```

Follow the service logs:

```bash
sudo snap logs -f lemonade-server.daemon
```

List the available models:

```bash
curl http://localhost:13305/api/v1/models
```

Load a model:

```bash
curl -X POST http://localhost:13305/api/v1/load \
  -H "Content-Type: application/json" \
  -d '{"model_name": "Llama-3.2-3B-Instruct-GGUF"}'
```

Send a chat completion request:

```bash
curl -X POST http://localhost:13305/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Llama-3.2-3B-Instruct-GGUF",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

List the models again:

```bash
curl http://localhost:13305/api/v1/models
```

Models are cached in `/var/snap/lemonade-server/common/.cache/huggingface/`.
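
When scripting the chat completion request, hand-written JSON tends to break as soon as the prompt contains quotes. The following is a sketch, not part of Lemonade Server: `json_escape`, `chat_payload`, and `chat` are hypothetical helpers, and the escaping only covers backslashes and double quotes (not newlines or control characters).

```bash
# Escape backslashes and double quotes so a string can sit inside JSON quotes.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the request body. Usage: chat_payload MODEL PROMPT
chat_payload() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send the body to the chat completions endpoint. Usage: chat MODEL PROMPT
chat() {
  curl -s -X POST http://localhost:13305/api/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "$(chat_payload "$1" "$2")"
}
```

With these defined, `chat "Llama-3.2-3B-Instruct-GGUF" 'Say "hello" in French'` sends a prompt containing quotes without manual escaping.
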
To use models from your home directory or removable media, the snap has access to:
- `$HOME` (via the `home` plug)
- `/media` and `/mnt` (via the `removable-media` plug)
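
Before pointing the server at a model file, it can help to check whether the path falls under one of these locations. A rough sketch: `snap_can_read` is a hypothetical helper that only compares path prefixes. It does not query snapd, so it cannot tell whether the plugs are actually connected.

```bash
# Hypothetical helper: succeeds if the path lies under $HOME, /media, or /mnt.
# Prefix comparison only; assumes an absolute, already-normalized path
# (no ".." components or symlinks pointing elsewhere).
snap_can_read() {
  case "$1" in
    "$HOME"/*|/media/*|/mnt/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For instance, `snap_can_read /opt/models/model.gguf || echo "copy the file into your home directory first"` flags a path outside the listed locations.
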

```bash
# Stop the service
sudo snap stop lemonade-server.daemon
# Start the service
sudo snap start lemonade-server.daemon
# Restart the service
sudo snap restart lemonade-server.daemon
# Disable autostart
sudo snap stop --disable lemonade-server.daemon
```

Show the command-line help:

```bash
lemonade-server --help
```

Lemonade Server is compatible with any application that supports the OpenAI API:
- Open WebUI - Set API base URL to `http://localhost:13305/api/v1`
- AnythingLLM - Configure as OpenAI-compatible endpoint
- Continue (VS Code) - Use OpenAI provider with custom base URL
- LangChain - Use `ChatOpenAI` with `base_url="http://localhost:13305/api/v1"`
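
Clients built on the official OpenAI SDKs can often be pointed at a different server through environment variables instead of code changes. A sketch under assumptions: recent official OpenAI Python and Node SDKs read `OPENAI_BASE_URL` and `OPENAI_API_KEY`, but individual applications may ignore them, and the key value below is a placeholder on the assumption that the local server does not validate it.

```bash
# Assumption: the client honors the OpenAI SDK environment variables.
export OPENAI_BASE_URL="http://localhost:13305/api/v1"
# Placeholder value; assumed not to be checked by the local server.
export OPENAI_API_KEY="unused"
```

Applications that expose their own setting for the base URL, such as those listed above, should be configured through that setting instead.
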
Check the logs for errors:
```bash
sudo journalctl -u snap.lemonade-server.daemon.service --no-pager -n 50
```

Ensure the `gpu-2404` interface is connected:

```bash
sudo snap connect lemonade-server:gpu-2404 mesa-2404:gpu-2404
sudo snap restart lemonade-server.daemon
```

The snap runs in strict confinement. If you need to access files outside the allowed locations, you may need to copy them to your home directory first.
Check your network connection and ensure you have enough disk space:
```bash
df -h /var/snap/lemonade-server/common/
```

To build the snap from source:

```bash
git clone https://github.com/lemonade-sdk/lemonade-server.git
cd lemonade-server
snapcraft
sudo snap install --dangerous lemonade-server_*.snap
```

The snap packaging is licensed under GPL-3. See LICENSE for details.