Replies: 12 comments
-
You should basically never do manipulations using the raw data pointer unless you are implementing a kernel. If you find yourself in that situation, the first thing to check is whether there is an op that can be used (in your case some combination of […]). If no op or combination of ops fits the bill (which is pretty unusual at this point), then that means we are missing the corresponding kernel. In which case you should file an issue, and if it's something we would add to […]
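To make the advice above concrete, here is a hypothetical before/after in NumPy (whose array API MLX's Python bindings broadly mirror): the same elementwise computation written first as a raw loop over the underlying buffer, then as a composition of ops. The specific computation (scale and shift) is just an illustration, not anything from the thread.

```python
import numpy as np

x = np.arange(6, dtype=np.float32).reshape(2, 3)

# Pointer-arithmetic style (what to avoid): walk the flat buffer
# element by element, as one would with a raw data pointer.
out_loop = x.copy()
flat = out_loop.ravel()
for i in range(flat.size):
    flat[i] = flat[i] * 2.0 + 1.0

# Op-composition style (what to prefer): the whole computation is
# expressed as vectorized ops on the array itself.
out_ops = x * 2.0 + 1.0

assert np.allclose(out_loop, out_ops)
```

The op-composition line translates directly to the C++ API (elementwise multiply and add on an `mlx::core::array`), whereas the loop has no lazy-evaluation-friendly equivalent.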
-
PS: Implementing llama using the C++ API is a great exercise, you will definitely learn it :)
-
Thank you very much for the guidance @awni. I'll resume my effort tomorrow. My goal is to have this stuff working by Friday. Also, assuming that I get this stuff working, is this the kind of code which should be contributed as an example? If so, to which repo?
-
Hi @dougdew64, what I think would be a better example is doing this but using a C++ version of the `mlx.nn` API.
-
Thanks @awni. I'll complete my current effort and share a performance comparison of my code with the code of llama2.c and llama2.cpp. I'm hoping to demonstrate that using MLX yields a performance improvement. After that, I'll start over using a C++ version of the `mlx.nn` API.
-
I think that I'm misunderstanding how to use the various MLX array access operations, such as […]. Please pardon my ignorance. I'm not a Python developer (at least not yet) and am accustomed to doing pointer arithmetic.
-
Pretty much every op you see in there should have a direct and simple translation to the C++ API, with the exception of bracket-style slicing […]
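A hedged illustration of the slicing point above: Python's bracket slicing corresponds to an explicit slice op taking per-axis start/stop indices, which is roughly how the C++ API spells it (the exact MLX `slice` signature is an assumption to verify against the headers). Shown here in NumPy:

```python
import numpy as np

a = np.arange(24).reshape(4, 6)

# Bracket-style slicing, as written in the Python implementation:
bracket = a[1:3, 2:5]

# The same selection expressed with explicit per-axis start/stop
# indices -- the shape an explicit C++ slice(a, starts, stops) call
# would take (hypothetical signature, check the MLX headers).
starts, stops = (1, 2), (3, 5)
explicit = a[tuple(slice(b, e) for b, e in zip(starts, stops))]

assert np.array_equal(bracket, explicit)
```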
-
Thank you very much for the guidance @awni. I'm laughing at myself getting my butt kicked by this exercise. Fortunately, it's fun.
-
Another point: if you find yourself assigning to an […]
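The comment above is truncated, but it presumably concerns slice assignment (`a[i:j] = v`), which has no direct analogue in a lazy, functional array API. One way to express the same update purely with ops, sketched in NumPy (the positions and values here are made up for illustration):

```python
import numpy as np

a = np.zeros(6, dtype=np.float32)
v = np.array([7.0, 9.0], dtype=np.float32)

# Instead of the in-place `a[2:4] = v`, build a new array from ops:
# slice off the untouched pieces and concatenate the update between them.
b = np.concatenate([a[:2], v, a[4:]])
```

A boolean mask plus `where` is another ops-only way to express the same update when the positions aren't contiguous.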
-
You've anticipated a future question which I was going to ask. Thanks!
-
I think that I will learn some numpy basics (https://numpy.org/doc/stable/user/basics.html) and then start over in my code.
-
@awni I'm going to do as you suggested and start over with a goal of implementing a C++ version of the mlx.nn API. It dawned on me that by doing so, I'd be able to compare my llama results with the results generated by the already-existing Python implementation, and could ask questions here when my results are different. That would be a much better support situation than I would have faced tomorrow when attempting to debug my llama2.cpp -> llama2.cpp+MLX implementation. Thank you again for providing such great support. Very much appreciated!
-
As a learning exercise, I'm re-implementing llama2.cpp atop MLX. At the moment, I'm attempting to re-implement the RoPE logic.
It seems to me that the simplest way to implement the kind of array accesses in the RoPE logic (highlighted on the right side of the screenshot) is to just do pointer arithmetic based on the MLX array's raw data pointer.
Am I correct? If not, what is the best way?
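For reference, RoPE can be expressed entirely as array ops, with no pointer arithmetic. A minimal NumPy sketch, assuming the consecutive-(even, odd)-pair rotation convention; pairing conventions differ between implementations, so treat this as one possible layout rather than llama2.cpp's exact one:

```python
import numpy as np

def rope(x, base=10000.0):
    """Rotary position embedding over consecutive (even, odd) pairs.

    x: (seq_len, head_dim) with head_dim even. The pairing convention
    here is an assumption; some implementations rotate across halves
    of the head dimension instead.
    """
    seq_len, head_dim = x.shape
    half = head_dim // 2

    # One rotation frequency per pair of dimensions.
    freqs = base ** (-2.0 * np.arange(half) / head_dim)     # (half,)
    angles = np.arange(seq_len)[:, None] * freqs[None, :]   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)

    # Split into the even/odd members of each pair, rotate each pair,
    # then interleave back -- slicing, multiply, stack, reshape only.
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    rotated = np.stack(
        [x_even * cos - x_odd * sin, x_even * sin + x_odd * cos], axis=-1
    )
    return rotated.reshape(seq_len, head_dim)
```

Each line maps to an op available through the MLX array API (slice, multiply, stack/concatenate, reshape), which is why the raw data pointer isn't needed here.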