Attempting to run a Persimmon model with the CUDA backend fails an assertion in `ggml_cuda_rope`: `ggml_is_contiguous(src0)`. Ref: https://github.com/ggerganov/llama.cpp/pull/5668#issuecomment-1959988387