Replies: 5 comments 1 reply
-
Hey hey! It's not a problem at all! 100% of this is just like, vibes and suggested ideas, I am not married to anything at all, and also, what's in the repo may not even live up to the plan. A lot of the stuff you read is like, a mental dump of "stuff I want to try out," and so hearing about what may work and what may cause problems is 100% welcome, and I appreciate it. I'm going to actually read your comment now, but I mostly just first wanted to say that, because it is important on its own.
-
I'm glad someone already tried this! This was something I was wondering about.
Yes! I recently watched this talk, and definitely wanted to see what I could take from that. I hadn't tried to port any of those ideas to Rust though, so I appreciate the sketch, I'm sure it'll be helpful.
It's no worries either way; I'm unsure how much contribution I want to take on for this project right now, because this is largely a learning experience + passion project for me at the moment. But if you do take a crack at this, I'd at least like to take a look at it!
Yep. As always, there's a ton of things to do with a baby language, and so I haven't figured out at what point I want to do each step. The first round is just to get anything going at all: I don't want to implement an entire language and then re-write its parser, but I'm also worried about doing this kind of thing too early, when measurements aren't "real", know what I mean? Regardless, the current state of things is very much in a more classic style, and even a little sloppy: I've been prioritizing shipping anything at all over something that perfectly fits my end plan. At this nascent stage of the language existing, there's just so little going on that I want to make things a bit more interesting, language-wise, before I really focus on getting these sorts of details in place in a more real way.
-
Glad to hear that; I was a little worried, somehow.
Ah, yeah, don't worry, I didn't mean direct contributions. What I meant was that I should be implementing a `SoAVec`-like struct (specifically, one that can only contain up to 2^32 items) in the context of my JavaScript engine project, and it will then likely be available on crates.io. So hopefully I'll have something that you can indeed take a look at :)
I absolutely understand (the JS engine is very much the same; trying things out, just getting things out the door before making sure they're necessarily the best thing ever). I wish you a good time with this project, cheers! <3
-
Excellent :)
-
I was chatting about this with Claude earlier: fun!
-
Hello Mr. Klabnik,
My apologies for barging in like this; I hope this is not badly received. I was reading your technical decisions / architectural documentation, and the "ECS-inspired design with separate arrays" popped out to me as a possible problem: it is likely to prove a cache-unfriendly data structure for the AST, rather than the cache-friendly one the rationale describes.
You might also have seen a recent DConf talk on creating a data-oriented parser, AST, and visitor generator program. They performed an ECS transformation on the AST node storage and found that it reduced performance by 20%. The reason seems fairly clear: they get more cache misses after the ECS transformation.
The likely cause is that the common case for an AST is to read the data of the next node, regardless of what its type is. With the ECS transformation separating node data based on node type, your memory layout optimises for reading the next node of the same type. But having multiple nodes of the same type one after another is a rare, exceptional case, so this will likely lead to a lot of cache misses.
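To make that concrete, here is a minimal Rust sketch of the "vector per node type" layout being described; all the type and field names here are invented for illustration, not taken from the repo or the talk:

```rust
// Invented node payloads for two hypothetical AST node kinds.
struct Literal {
    value: i64,
}
struct BinaryExpr {
    lhs: u32,
    rhs: u32,
}

// A node reference is (kind, index local to that kind's Vec).
enum NodeRef {
    Literal(u32),
    BinaryExpr(u32),
}

// "Vector per node type": each kind lives in its own Vec.
struct EcsAst {
    literals: Vec<Literal>,
    binary_exprs: Vec<BinaryExpr>,
    // Source-order traversal follows NodeRefs that ping-pong between the
    // per-kind Vecs above, so "the next node" is rarely adjacent in memory.
    order: Vec<NodeRef>,
}

fn main() {
    // `1 + 2`: two literals and one binary expression, spread over two Vecs.
    let ast = EcsAst {
        literals: vec![Literal { value: 1 }, Literal { value: 2 }],
        binary_exprs: vec![BinaryExpr { lhs: 0, rhs: 1 }],
        order: vec![
            NodeRef::Literal(0),
            NodeRef::BinaryExpr(0),
            NodeRef::Literal(1),
        ],
    };
    // Walking `order` touches literals[0], then binary_exprs[0], then
    // literals[1]: hops between separate allocations, not one linear run.
    let mut sum = 0i64;
    for node in &ast.order {
        if let NodeRef::Literal(i) = node {
            sum += ast.literals[*i as usize].value;
        }
    }
    assert_eq!(sum, 3);
}
```

The layout is only contiguous per kind, which is exactly the access pattern an AST traversal rarely has.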
The way you'd likely want to perform the ECS transformation instead is to take the `Vec<ASTNode>` and turn it into a struct-of-arrays layout: the AST nodes are split into a single Struct-of-Arrays vector with the node types in one array, common node data in another array[^1], and the type-specific node data in yet other arrays[^2]. A final, separate spill vector is kept on the side for any time a node needs a variable amount of data, or just needs more data than can fit into the static slots[^3]; in these cases the static slots contain an index into the spill vector from which the AST node can read the rest of its data (the spill entry can start with a dynamic length to tell the AST node how much data to read). (This is just copying Andrew Kelley / Zig's AST structure; see "Practical DOD" for details.)
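If it helps, here is a minimal Rust sketch of that kind of layout. All the names, the choice of exactly two static slots, and the spill encoding are my own assumptions for illustration, not anything from the repo:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum NodeKind {
    Number,
    Add,
    Call,
}

// One SoA "vector": parallel arrays all indexed by the same node id,
// plus a spill vector for payloads that don't fit the static slots.
struct Ast {
    kinds: Vec<NodeKind>,   // one small tag per node
    spans: Vec<(u32, u32)>, // common data: source offsets
    data: Vec<[u32; 2]>,    // two type-dependent "slots" per node
    spill: Vec<u32>,        // overflow runs, each stored as [len, items...]
}

impl Ast {
    fn new() -> Self {
        Ast { kinds: vec![], spans: vec![], data: vec![], spill: vec![] }
    }

    /// Push a node; payloads of more than two words go to the spill
    /// vector, and slot 0 then stores the spill index instead.
    fn push(&mut self, kind: NodeKind, span: (u32, u32), args: &[u32]) -> u32 {
        let id = self.kinds.len() as u32;
        let slots = if args.len() <= 2 {
            let mut s = [0u32; 2];
            s[..args.len()].copy_from_slice(args);
            s
        } else {
            let spill_at = self.spill.len() as u32;
            self.spill.push(args.len() as u32); // run starts with its length
            self.spill.extend_from_slice(args);
            [spill_at, 0]
        };
        self.kinds.push(kind);
        self.spans.push(span);
        self.data.push(slots);
        id
    }
}

fn main() {
    let mut ast = Ast::new();
    let a = ast.push(NodeKind::Number, (0, 1), &[42]);
    let b = ast.push(NodeKind::Number, (4, 5), &[7]);
    let add = ast.push(NodeKind::Add, (0, 5), &[a, b]);
    // A three-argument node overflows the two static slots and spills:
    let call = ast.push(NodeKind::Call, (0, 20), &[add, a, b]);

    assert_eq!(ast.kinds[add as usize], NodeKind::Add);
    assert_eq!(ast.data[add as usize], [a, b]);
    assert_eq!(ast.data[call as usize][0], 0); // spill run starts at index 0
    assert_eq!(&ast.spill[..], &[3, add, a, b]); // length-prefixed run
}
```

Reading node ids in order now walks `kinds`, `spans`, and `data` linearly, which is the traversal-friendly layout the prose above is after.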
With this, the memory layout of the AST should be a lot more cache-friendly: when an AST node's type is read, the next one's type is the next byte over. Likewise, the common data for the two nodes sit one after the other, as do the unique / type-dependent data. Extracting the unique data does now require `unsafe {}` (or you could rewrite the unique data to be just `u32`s or such, reinterpreted based on the tag, for less unsafe usage, but the result is much the same), but it should be pretty easy to keep this completely on the up-and-up. Finally, the spill data's location becomes harder for the CPU to predict, but it should be both rare and cache-local: the previous spill data ends where the next one begins, so there's still a good chance that the next node needing spill data will find its spill data hot in the cache, even if there were multiple AST nodes without spill data between it and the previous node with spillage.

There; I've said my piece. As said, I hope this is not received badly: I absolutely do not aim to disparage or make fun of your ECS transformation (or plans for one? I didn't see it in the code, so I assume this is planning-stages for now). Much rather, I am very much in favour of it and want it to succeed! I may even offer my meager aid to the effort at some point, as I am in need of a `SoAVec` type that would, preferably, work without (much use of) macros. But I couldn't help noticing that the recent DConf talk had a negative result with precisely this kind of "vector per node type" transformation, and I hope you won't end up finding the same.

Best regards,
-Aapo Alasuutari
Footnotes

[^1]: I've split the common data into two arrays here; whether it actually makes sense to split is hard to tell, and would need to be measured to know for certain. But e.g. splitting the offset or line & column data off from the rest may well make sense, as those are presumably only really accessed during error recovery and reporting.

[^2]: Here it is very likely that at least two or maybe three arrays make sense: some common node types will only need one or two "data slots", so fusing all data slots together into a single union would introduce wasted padding in those cases.

[^3]: If a few relatively rare node types require e.g. five slots, it probably makes sense to move their extra data into the spill section rather than introduce five static arrays, roughly half of which would sit mostly empty because other nodes don't need those slots.
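To make the padding point in footnotes 2 and 3 concrete, here is a tiny sketch; the node types and slot counts are invented for illustration:

```rust
use std::mem::size_of;

// Invented payloads: one node kind needs a single u32 slot, another five.
struct SmallNode {
    a: u32,
}
struct BigNode {
    a: u32,
    b: u32,
    c: u32,
    d: u32,
    e: u32,
}

// Fusing every kind's slots into one worst-case-sized array:
type FusedSlots = [u32; 5];

fn main() {
    assert_eq!(size_of::<SmallNode>(), 4);
    assert_eq!(size_of::<BigNode>(), 20);
    assert_eq!(size_of::<FusedSlots>(), 20);
    // Storing a SmallNode in FusedSlots wastes 16 of 20 bytes per node;
    // keeping only a couple of static slot arrays plus a spill vector
    // avoids paying the worst case for every node.
}
```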