Some parts of the model file will remain mapped. The same thing happens when partially offloading a model; in practice it is unlikely to cause issues because the OS can reclaim that memory if necessary. Disabling mmap with --no-mmap should avoid this.
Name and Version
b4792
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-cli
Command line
Problem description & steps to reproduce
When loading the model, it uses 4.3 GB of RAM. When using Q4_K_S (a similar file size), it only uses 2.7 GB of RAM.
First Bad Commit
No response
Relevant log output