
added cuda float16->float32 upcasting to ggml_cuda_cpy #685

Closed
wants to merge 1 commit

Conversation

balisujohn
Contributor

Really basic addition. This seems to work in tortoise.cpp, but I haven't tested it after copying it here other than to see that it compiles. It wasn't obvious to me whether there are tests for CUDA ops in ggml where I could add a test for this behavior. Curious to hear if I should add a test somewhere, and whether I should lint, etc.

@slaren
Member

slaren commented Jan 8, 2024

You can add a test to test-backend-ops by adding a test case. Something like this should work:

test_cases.emplace_back(new test_cpy(GGML_TYPE_F16, GGML_TYPE_F32, {256, 10, 10, 1}));

ggml_backend_cuda_supports_op also needs to be updated to return true for this case; otherwise the test will be skipped as "not supported".

It seems that #686 also contains these changes, should we focus on that instead?

@balisujohn
Contributor Author

Yeah that works for me, since these are both improvements to ggml_cuda_cpy. I'll be away from my graphics card for a few days but once I'm back I'll look into adding a test for this and the 4d copy behavior.

@balisujohn balisujohn closed this Jan 12, 2024