llama-server-watchdog

A lightweight PowerShell watchdog that keeps llama-server alive on Windows.

Version: 1.1.0

Monitors the health endpoint and automatically restarts the process when it becomes unresponsive.

Why

llama-server (llama.cpp) on AMD GPUs has a known stability issue where the process crashes or becomes unresponsive after idle periods. This is documented across multiple GitHub issues (llama.cpp #10227, ollama #4492, ROCm #2625) and is related to AMD GPU drivers losing the device during idle. This watchdog catches those failures and restarts automatically.

How it works

Pings http://localhost:8080/health every 2 minutes
After 2 consecutive failures (at most about 4 minutes unresponsive), kills and restarts llama-server
Logs all activity to watchdog.log with timestamps
Auto-rotates the log at 5MB
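
The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of llama-watchdog.ps1: the function names (Test-LlamaHealth, Get-NextFailureCount, Write-WatchdogLog, Restart-LlamaServer, Start-Watchdog), the 10-second request timeout, the ".old" rotation suffix, and the llama-server.exe launch arguments are all assumptions made for the example.

```powershell
# Hypothetical sketch of the watchdog logic; all names here are illustrative.

function Test-LlamaHealth {
    param([string]$Url, [int]$TimeoutSec = 10)
    try {
        $response = Invoke-WebRequest -Uri $Url -UseBasicParsing -TimeoutSec $TimeoutSec
        return ($response.StatusCode -eq 200)
    } catch {
        return $false   # connection refused, timeout, or an HTTP error status
    }
}

function Get-NextFailureCount {
    param([int]$Current, [bool]$Healthy)
    # A healthy response resets the streak; a failure extends it.
    if ($Healthy) { return 0 }
    return $Current + 1
}

function Write-WatchdogLog {
    param([string]$Path, [string]$Message, [long]$MaxBytes = 5MB)
    # Rotate: once the log reaches the size limit, move it aside and start fresh.
    if ((Test-Path $Path) -and ((Get-Item $Path).Length -ge $MaxBytes)) {
        Move-Item -Path $Path -Destination "$Path.old" -Force
    }
    $stamp = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
    Add-Content -Path $Path -Value "[$stamp] $Message"
}

function Restart-LlamaServer {
    param([string]$LlamaDir, [string]$ModelPath, [int]$Port)
    # Assumed launch command; adjust the arguments to match how you start llama-server.
    Get-Process -Name "llama-server" -ErrorAction SilentlyContinue | Stop-Process -Force
    Start-Process -FilePath (Join-Path $LlamaDir "llama-server.exe") `
        -ArgumentList "-m", $ModelPath, "--port", $Port `
        -WorkingDirectory $LlamaDir -WindowStyle Hidden
}

function Start-Watchdog {
    param(
        [string]$HealthUrl = "http://localhost:8080/health",
        [int]$CheckInterval = 120,
        [int]$MaxFailures = 2,
        [string]$LogPath = "C:\llama-cpp\watchdog.log",
        [scriptblock]$OnRestart = { }   # supply the kill-and-restart step here
    )
    $failures = 0
    while ($true) {
        $healthy  = Test-LlamaHealth -Url $HealthUrl
        $failures = Get-NextFailureCount -Current $failures -Healthy $healthy
        if ($healthy) {
            Write-WatchdogLog -Path $LogPath -Message "Health check OK"
        } else {
            Write-WatchdogLog -Path $LogPath -Message "Health check failed ($failures/$MaxFailures)"
        }
        if ($failures -ge $MaxFailures) {
            Write-WatchdogLog -Path $LogPath -Message "Restarting llama-server"
            & $OnRestart
            $failures = 0
        }
        Start-Sleep -Seconds $CheckInterval
    }
}
```

Keeping the counter update in its own small function makes the "consecutive" rule easy to check in isolation: any successful check resets the count, so two failures separated by a success never trigger a restart.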

Configuration

Edit the top of llama-watchdog.ps1 to match your setup:

$LlamaDir: path to your llama.cpp install (default: C:\llama-cpp)
$ModelPath: relative path to your model file
$Port: server port (default: 8080)
$CheckInterval: seconds between checks (default: 120)
$MaxFailures: consecutive failures before restart (default: 2)
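
As a rough guide, the configuration block at the top of the script might look like the sketch below. The five variable names and their defaults come from the list above; the model file name is a placeholder, the assumption that $ModelPath is relative to $LlamaDir is mine, and the two derived values ($HealthUrl, $WorstCaseDetectionSec) are hypothetical helpers added only to show how the settings relate.

```powershell
# --- Configuration (illustrative values; adjust to your machine) ---
$LlamaDir      = "C:\llama-cpp"             # default listed above
$ModelPath     = "models\your-model.gguf"   # placeholder; assumed relative to $LlamaDir
$Port          = 8080                       # server port
$CheckInterval = 120                        # seconds between health checks
$MaxFailures   = 2                          # consecutive failures before restart

# --- Derived values (hypothetical, for illustration only) ---
$HealthUrl = "http://localhost:$Port/health"
# Longest time a dead server can go unnoticed: it fails right after a good check,
# then needs $MaxFailures further checks to be declared down.
$WorstCaseDetectionSec = $CheckInterval * $MaxFailures
```

With the defaults, $WorstCaseDetectionSec is 240 seconds. Lowering $CheckInterval shortens the outage but polls the server more often; raising $MaxFailures makes the watchdog more tolerant of a single slow response.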

Install

Copy llama-watchdog.ps1 to your llama.cpp directory
Open PowerShell as Administrator and run:

$action = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-ExecutionPolicy Bypass -WindowStyle Hidden -File C:\llama-cpp\llama-watchdog.ps1"
$trigger = New-ScheduledTaskTrigger -AtLogon
$settings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries -ExecutionTimeLimit 0 -RestartCount 3 -RestartInterval (New-TimeSpan -Minutes 1)
$settings.DisallowStartIfOnBatteries = $false
Register-ScheduledTask -TaskName "LlamaWatchdog" -Action $action -Trigger $trigger -Settings $settings -Description "Monitors llama-server health and auto-restarts on failure" -RunLevel Highest

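If you already start llama-server from its own scheduled task, disable that task so only the watchdog launches the server (skip this step if no such task exists):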

Disable-ScheduledTask -TaskName "llama-server"

Start it:

Start-ScheduledTask -TaskName "LlamaWatchdog"
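
To confirm the task registered and is running, something like the helper below can be used. Get-WatchdogTaskStatus is a hypothetical convenience function written for this example, built on the standard Get-ScheduledTask and Get-ScheduledTaskInfo cmdlets; it is not part of llama-watchdog.ps1.

```powershell
# Hypothetical helper: summarise the scheduled task's current state.
function Get-WatchdogTaskStatus {
    param([string]$TaskName = "LlamaWatchdog")
    $task = Get-ScheduledTask -TaskName $TaskName -ErrorAction Stop
    $info = $task | Get-ScheduledTaskInfo
    [pscustomobject]@{
        TaskName       = $task.TaskName
        State          = $task.State            # expected to be Running while the watchdog loop is active
        LastRunTime    = $info.LastRunTime
        LastTaskResult = $info.LastTaskResult   # 0 = last run succeeded; 267009 (0x41301) = currently running
    }
}

# Usage (run on the machine where the task was registered):
# Get-WatchdogTaskStatus
```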

Check status

Get-Content C:\llama-cpp\watchdog.log
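
Because the log can grow to several megabytes before it rotates, it is often more convenient to read only the most recent entries. The sketch below wraps Get-Content's -Tail and -Wait parameters; Show-WatchdogLog is a hypothetical helper name, and the default path assumes the C:\llama-cpp install location used above.

```powershell
# Hypothetical helper: show the last few log entries, optionally following new ones.
function Show-WatchdogLog {
    param(
        [string]$Path = "C:\llama-cpp\watchdog.log",
        [int]$Lines = 20,
        [switch]$Follow
    )
    if ($Follow) {
        # Keeps the file open and prints new lines as they are written (Ctrl+C to stop).
        Get-Content -Path $Path -Tail $Lines -Wait
    } else {
        Get-Content -Path $Path -Tail $Lines
    }
}

# Usage:
# Show-WatchdogLog              # last 20 entries
# Show-WatchdogLog -Follow      # live view while the watchdog runs
```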

Tested on

Windows 11
AMD RX 9070 XT (16GB VRAM)
Qwen 3.5 9B Q8_0 quantization
32K context window

Known AMD GPU stability issues

llama.cpp #10227: Server slows down over time on RX 7900 XT
ollama #4492: Crashes after idle on AMD GPUs
ROCm #2625: 100% GPU usage during idle with HIP streams on RDNA3
AMD driver release notes acknowledge intermittent crashes on 9000 series

License

MIT
