Prerequisites
I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
I would like more documentation and shared knowledge about the fact that llama.cpp can be built for CUDA 11.4 and CUDA arch 3.7.
Motivation
I was able to pick up some Tesla K80s for $20 each on eBay for other projects I will be doing. Given their VRAM and overall CUDA performance, and with GPU prices inflating once again, I thought these cards might still be good.
Possible Implementation
I was able to compile and run llama.cpp for the Tesla K80 by downgrading gcc and g++ from 12 to 10, installing NVIDIA driver version 470.256.02 and CUDA toolkit 11.4. I was then able to build it by running cmake with these arguments:
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda-11.4/bin/nvcc -DCMAKE_CUDA_ARCHITECTURES='52;61;70;75;37'
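For anyone trying to reproduce this, here is a rough sketch of the full sequence on a Debian/Ubuntu-style system. The package names, the CUDA install path, and the explicit gcc-10 compiler flags are assumptions on my part; adjust them for your distro.

# CUDA 11.4's nvcc rejects gcc/g++ 12, so point the build at gcc/g++ 10
sudo apt install gcc-10 g++-10
# Install NVIDIA driver 470.256.02 and the CUDA 11.4 toolkit following NVIDIA's
# instructions for your distro, then configure and build llama.cpp against that toolkit:
cmake -B build -DGGML_CUDA=ON \
      -DCMAKE_C_COMPILER=gcc-10 -DCMAKE_CXX_COMPILER=g++-10 \
      -DCMAKE_CUDA_COMPILER=/usr/local/cuda-11.4/bin/nvcc \
      -DCMAKE_CUDA_ARCHITECTURES='52;61;70;75;37'
cmake --build build --config Release -j "$(nproc)"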
While I understand you already support these older cards through Vulkan (very cool, big fan), I find that a lot of performance is left on the table for these older Tesla cards. Running DeepSeek-R1-Distill-Qwen-7B-F16.gguf with Vulkan I was able to achieve around 3 T/s, but with CUDA I got around 5.5 to 6 T/s with just one Tesla K80 that I bought for just $20. And the best part is, I think it could be even faster: I am currently bottlenecked by my CPU (AMD Opteron 6378), as the one core keeping the GPU fed is pinned at 100%.
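If it helps anyone compare, these numbers could be reproduced roughly like this with the bundled llama-bench tool (the binary and model paths are assumptions on my part; -ngl 99 simply offloads all layers to the GPU):

./build/bin/llama-bench -m DeepSeek-R1-Distill-Qwen-7B-F16.gguf -ngl 99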
Also, please do not scoff at the 6 T/s. I am coming at this from the perspective of using AI rather than training it, and 6 T/s is impressively fast for $20, with the option to add more GPUs if you so desire.
Once again, I am not asking for support of clearly deprecated hardware, but rather for discussion of workarounds and bug reports on these old platforms.