Linux GPU usage maxed at 15%? #1401
JustPlaneTastic started this conversation in General
-
So, first off, I love privateGPT. Great work is going on here. Thank you.
I am attempting to make use of it, but things have been quite slow for me. I've read through a few issues, done multiple reinstalls, and tested a few models, but I can't quite get the performance I'm looking for. I'm hoping I'm just missing something rather than hitting a functional issue.
These are the commands I run to set it up:
CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
PGPT_PROFILES=local make run
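As a sanity check, the cuBLAS build can be confirmed outside privateGPT; a one-liner like the following (the model path is just a placeholder) should print the same BLAS = 1 banner when the model loads:
poetry run python -c "from llama_cpp import Llama; Llama(model_path='models/your-model.gguf', n_gpu_layers=-1)"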
privateGPT loads with BLAS = 1 and shows that it sees the following details for the card:
llm_load_tensors: ggml ctx size = 0.11 MiB
llm_load_tensors: using CUDA for GPU acceleration
llm_load_tensors: mem required = 70.42 MiB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
llm_load_tensors: VRAM used: 4095.06 MiB
...............................................................................................
llama_new_context_with_model: n_ctx = 4000
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: KV self size = 500.00 MiB, K (f16): 250.00 MiB, V (f16): 250.00 MiB
llama_build_graph: non-view tensors processed: 676/676
llama_new_context_with_model: compute buffer total size = 284.88 MiB
llama_new_context_with_model: VRAM scratch buffer: 281.82 MiB
llama_new_context_with_model: total VRAM used: 4376.88 MiB (model: 4095.06 MiB, context: 281.82 MiB)
AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
When I test prompts and questions I regularly see the CPU pegged, but nvtop shows at most 15% usage of my GPU, and most of the time it is in the single digits.
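For a second data point alongside nvtop, nvidia-smi can poll GPU utilization and VRAM once per second while a prompt is running:
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv -l 1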
I've seen references to .env files and privateGPT.py adjustments, but I am not seeing those files in my installation, so I cannot edit them. Surely there is a config adjustment I am missing.
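From what I can tell, the .env and privateGPT.py files belong to the older codebase; this version is configured through settings.yaml plus profile overrides such as settings-local.yaml, which PGPT_PROFILES=local selects. As a rough sketch only (key names vary between versions, so verify against the settings.yaml in your checkout), a local profile looks something like:
# settings-local.yaml -- illustrative sketch, not authoritative
llm:
  mode: local
local:
  llm_hf_repo_id: TheBloke/Mistral-7B-Instruct-v0.1-GGUF
  llm_hf_model_file: mistral-7b-instruct-v0.1.Q4_K_M.gguf
  embedding_hf_model_name: BAAI/bge-small-en-v1.5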
Any pointers would be appreciated.
Thanks
Before writing this question I read through these, as well as the documentation for privateGPT:
#217
#425
maozdemir#2
-
Replies: 1 comment
-
Hi, I'm on Windows but have the same issue. Any chance you found a solution? Thanks