Skip to content

ashish993/Awesome-Local-LLMs-Guide

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Awesome Local LLMs Guide

Run AI models privately on your own hardware — offline, free, under your control.

Why Run Locally?

Concern Cloud API Local
Privacy Data leaves machine 100% local
Cost Per-token billing Hardware only
Internet Required Not needed

Hardware by Model Size

Size Min VRAM (Q4) Speed (t/s)
3B 2GB 60-120
7B 4-5GB 30-60
13B 8GB 15-30
70B 40GB 2-8

Tools

Tool Best For API
Ollama Easiest setup OpenAI-compatible REST
LM Studio GUI desktop OpenAI-compatible REST
llama.cpp Max performance CLI
vLLM Production serving REST

Quick Start with Ollama

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2
ollama run llama3.2

Model Recommendations

Use Case Model Size
Fast chat Llama 3.2 3B 2GB
Quality chat Mistral 7B 4GB
Code DeepSeek Coder 6.7B 4GB

Quantization Guide

Format Size vs FP16 Quality
Q8_0 50% Minimal loss
Q4_K_M 28% Sweet spot
Q2_K 14% Noticeable loss

About

No description, website, or topics provided.

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages