This document defines how LocalAIStack maps hardware characteristics to allowed software capabilities.
Policies act as a safety and predictability layer between hardware detection and software installation.
Policies define constraints and permissions, not actions.
If a capability cannot be reliably supported, it is disabled by default.
User overrides are allowed, but they must always be explicit and traceable.
Policies consume normalized hardware profiles:
- CPU cores and topology
- System RAM
- GPU count
- GPU VRAM
- GPU interconnects (e.g. NVLink)
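
As an illustration, a normalized hardware profile could be modeled as an immutable record. This is a minimal sketch, not LocalAIStack's actual schema: the class name `HardwareProfile` and every field name are assumptions chosen to mirror the list above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class HardwareProfile:
    """Hypothetical normalized profile; field names are illustrative only."""

    cpu_cores: int
    cpu_sockets: int                      # coarse stand-in for CPU topology
    ram_gb: int
    gpu_vram_gb: Tuple[int, ...] = ()     # one entry per detected GPU
    interconnects: FrozenSet[str] = frozenset()  # e.g. {"nvlink"}

    @property
    def gpu_count(self) -> int:
        return len(self.gpu_vram_gb)

    @property
    def min_gpu_vram_gb(self) -> int:
        # The smallest GPU is the conservative figure to evaluate against.
        return min(self.gpu_vram_gb, default=0)


profile = HardwareProfile(
    cpu_cores=16,
    cpu_sockets=1,
    ram_gb=64,
    gpu_vram_gb=(24, 24),
    interconnects=frozenset({"nvlink"}),
)
```

Freezing the dataclass keeps the profile read-only once detection has finished, so policy evaluation cannot alter it by accident.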
Policies operate on the following dimensions:
- Maximum model size
- Allowed inference runtimes
- Parallel execution limits
- Memory and VRAM utilization ceilings
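
The four dimensions above can be pictured as one validated record. The sketch below is an assumption about shape only: `CapabilityLimits`, its field names, and the choice to express ceilings as fractions between 0 and 1 are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class CapabilityLimits:
    """Hypothetical container for the policy dimensions listed above."""

    max_model_size_b: float       # model size in billions of parameters
    runtimes: FrozenSet[str]      # allowed inference runtimes
    max_parallel_jobs: int        # parallel execution limit
    ram_ceiling: float            # fraction of system RAM that may be used
    vram_ceiling: float           # fraction of GPU VRAM that may be used

    def __post_init__(self) -> None:
        # Reject ceilings outside (0, 1] so a bad policy fails loudly.
        for name in ("ram_ceiling", "vram_ceiling"):
            value = getattr(self, name)
            if not 0.0 < value <= 1.0:
                raise ValueError(f"{name} must be in (0, 1], got {value}")
        if self.max_parallel_jobs < 1:
            raise ValueError("max_parallel_jobs must be at least 1")


limits = CapabilityLimits(
    max_model_size_b=30,
    runtimes=frozenset({"llama.cpp", "vllm"}),
    max_parallel_jobs=2,
    ram_ceiling=0.8,
    vram_ceiling=0.9,
)
```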
| Tier | Typical Capability |
|---|---|
| Tier 1 | ≤14B inference |
| Tier 2 | ≈30B inference |
| Tier 3 | ≥70B / multi-GPU |
Tiers are policy-derived, not hardcoded.
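
One way to read "policy-derived" is that the tier label is computed from the effective capability set rather than stored anywhere. The sketch below assumes that reading. The function `derive_tier` is hypothetical, and the exact boundaries between tiers are assumptions, since the table gives only typical values.

```python
def derive_tier(max_model_size_b: float, multi_gpu: bool = False) -> str:
    """Map an effective capability set to a tier label.

    Thresholds are assumptions based on the table above:
    up to 14B is Tier 1, 70B and up (or multi-GPU) is Tier 3,
    and everything in between is Tier 2.
    """
    if multi_gpu or max_model_size_b >= 70:
        return "Tier 3"
    if max_model_size_b > 14:
        return "Tier 2"
    return "Tier 1"
```

For example, `derive_tier(30)` yields `"Tier 2"`, matching the ≈30B row.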

    policies:
      - name: tier2-default
        conditions:
          gpu_vram_min: 32GB
          ram_min: 64GB
        allow:
          max_model_size: 30B
          runtimes:
            - llama.cpp
            - vllm
        deny:
          - multi_gpu_training

- Detect hardware
- Normalize profile
- Evaluate matching policies
- Merge allowed capabilities
- Expose effective capability set
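
The "evaluate matching policies" step can be sketched as a filter over policy conditions. This is a minimal illustration under stated assumptions: policies are plain dictionaries shaped like the `tier2-default` example, the profile is a dictionary with hypothetical keys `gpu_vram_gb` and `ram_gb`, and unknown condition keys fail closed so that an unrecognized requirement never enables a capability.

```python
import re

# Hypothetical mapping from condition keys to profile fields.
CONDITION_FIELDS = {"gpu_vram_min": "gpu_vram_gb", "ram_min": "ram_gb"}


def parse_size(text: str) -> float:
    """Parse a size such as '32GB' or '30B' into its numeric part."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(GB|B)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1))


def matches(conditions: dict, profile: dict) -> bool:
    """Return True only if every condition is known and satisfied."""
    for key, required in conditions.items():
        field = CONDITION_FIELDS.get(key)
        if field is None:
            return False  # unknown condition: fail closed
        if profile.get(field, 0) < parse_size(required):
            return False
    return True


def matching_policies(policies: list, profile: dict) -> list:
    """Keep the policies whose conditions the profile meets."""
    return [p for p in policies if matches(p.get("conditions", {}), profile)]


POLICIES = [
    {
        "name": "tier2-default",
        "conditions": {"gpu_vram_min": "32GB", "ram_min": "64GB"},
    }
]
```

A profile with 48 GB of VRAM and 128 GB of RAM matches `tier2-default`; one with 24 GB of VRAM does not.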

When matching policies conflict, the following rules apply:
- Most restrictive rule wins
- Explicit denies override allows
- User overrides require confirmation
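
The first two rules can be sketched as a merge over the matched policies. This is one possible interpretation, not the project's confirmed algorithm: "most restrictive" is taken to mean the smallest model size and the intersection of runtimes, and deny entries are assumed to name either capabilities or runtimes. The function name and the numeric `max_model_size_b` key are hypothetical.

```python
def merge_capabilities(matched: list) -> dict:
    """Merge matched policies into one effective capability set."""
    if not matched:
        # Nothing matched: expose nothing, in line with disabled-by-default.
        return {"max_model_size_b": 0, "runtimes": set(), "denied": set()}

    # Most restrictive rule wins: smallest size limit, common runtimes only.
    max_size = min(p["allow"]["max_model_size_b"] for p in matched)
    runtimes = set.intersection(*(set(p["allow"]["runtimes"]) for p in matched))

    # Explicit denies override allows: collect every deny, then subtract.
    denied = set().union(*(p.get("deny", []) for p in matched))
    return {
        "max_model_size_b": max_size,
        "runtimes": runtimes - denied,
        "denied": denied,
    }


policy_a = {
    "allow": {"max_model_size_b": 30, "runtimes": ["llama.cpp", "vllm"]},
    "deny": ["multi_gpu_training"],
}
policy_b = {
    "allow": {"max_model_size_b": 14, "runtimes": ["llama.cpp", "vllm"]},
    "deny": ["vllm"],
}
effective = merge_capabilities([policy_a, policy_b])
```

Here the effective limit is 14B and only `llama.cpp` remains, because `policy_b` is stricter on size and denies `vllm` outright.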
Overrides are:
- Local-only
- Versioned
- Reversible
Overrides never modify base policy definitions.
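
These properties can be illustrated with a small in-memory override history. The sketch is hypothetical throughout: `OverrideLog` and its methods are invented for illustration, persistence to local storage is omitted, and the confirmation requirement is modeled as a simple flag.

```python
import copy


class OverrideLog:
    """Hypothetical ordered history of confirmed user overrides."""

    def __init__(self) -> None:
        # Position in this list plus one is the override's version number.
        self._versions = []

    def add(self, changes: dict, confirmed: bool = False) -> int:
        """Record an override and return its version number."""
        if not confirmed:
            raise PermissionError("override requires explicit confirmation")
        self._versions.append(dict(changes))
        return len(self._versions)

    def revert(self) -> int:
        """Drop the latest override and return the remaining version count."""
        if self._versions:
            self._versions.pop()
        return len(self._versions)

    def apply(self, base: dict) -> dict:
        """Return base plus overrides; the base itself is never mutated."""
        effective = copy.deepcopy(base)
        for changes in self._versions:
            effective.update(changes)
        return effective
```

Because `apply` works on a deep copy, reverting every override always restores exactly the base capability set.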

Policies are deliberately limited in scope:
- Policies do not optimize performance
- Policies do not schedule workloads
- Policies do not manage runtime behavior
Policy mapping ensures that:
- Users are not exposed to unsafe configurations
- Software availability matches hardware reality
- LocalAIStack remains predictable across machines