I want to use the chatglm3-6b model in llama.cpp. How can I do the equivalent of torch.chunk on a ggml tensor? The code in question is from https://hf-mirror.com/THUDM/chatglm3-6b/blob/main/modeling_chatglm.py

Replies: 1 comment

You can accomplish this using `ggml_view_1d`. For example, when splitting a tensor with 2048 elements in half, the first int argument is the number of elements in the view and the second argument is the byte offset into the source tensor.
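A minimal sketch of that approach, assuming a recent ggml; the context setup and the tensor names `t`, `first`, and `second` are illustrative, not from the original reply:

```c
#include "ggml.h"

int main(void) {
    // Small scratch arena for the context; size is arbitrary for this sketch.
    struct ggml_init_params params = {
        .mem_size   = 16 * 1024 * 1024,
        .mem_buffer = NULL,
        .no_alloc   = false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // Source tensor with 2048 f32 elements.
    struct ggml_tensor * t = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 2048);

    // First half: 1024 elements starting at byte offset 0.
    struct ggml_tensor * first  = ggml_view_1d(ctx, t, 1024, 0);

    // Second half: 1024 elements starting after the first 1024 elements.
    // The offset is in bytes, hence the multiplication by the element size.
    struct ggml_tensor * second = ggml_view_1d(ctx, t, 1024,
                                               1024 * ggml_element_size(t));

    (void)first; (void)second;
    ggml_free(ctx);
    return 0;
}
```

Note that `ggml_view_1d` returns views into the same backing data rather than copies, which matches the semantics of `torch.chunk` in the Python model code.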