Introduce standalone C parser for RBS with arena allocation #2398
Initial template for C structs
Use allocator in node constructors
Signed-off-by: Alexandre Terrasa <[email protected]>
Add linked list implementation
Type `Class#super_class` field
Type fields of `RBS::Types::Block`
Type `block` fields
Type `RBS::Types::Proc#self_type` field
Refactor `parse_function`
Copy value in `rbs_struct_to_ruby_value`
Remove usages of `rbs_loc` from `parser.c`
Extract `rbs_location.h`
Migrate `RBS::Types::Function::Param` fields
Type `RBS::Types::UntypedFunction` fields
Type fields of `RBS::AST::TypeParam`
Type some more fields of `RBS::AST::Members::Attr`
Type fields in `RBS::AST::Members::MethodDefinition`
Type `RBS::AST::Directives::Use::SingleClause#new_name`
Type `RBS::Namespace#absolute`
Temporary handle nil types
Handle `bool` type
Type all fields of `RBS::Types::Variable`
Migrate `RBS::TypeName`
Migrate `parse_use_clauses`
Migrate `class_instance_name`
Handle overloads as a rbs_node_list
Remove more `builds_ruby_object_internally` flags
Invert `builds_ruby_object_internally` default value
Introduce `rbs_location_t`
Store C structs instead of Ruby `VALUE`s
Introduce +rbs_ast_symbol_t and migrate to it
Remove ZzzTmpNotImplemented node
Remove one more instance of EMPTY_ARRAY
Migrate from VALUE array to rbs_node_list_t
Migrate `method_params` from taking a VALUE arrays
Migrate `parse_type_list` from taking a VALUE array
Forward all C-typed params as-is
Get types on constructor params
Handle mix of C types and Ruby VALUE
Move Ruby object construction into `new` functions
Conditionally construct `ruby_value` internally
Type Attr* field `ivar_name`
Add `AST::Bool`
Use two less VALUE values
Use more instance of `bool`
Add Hash implementation
Use C hash for `check_key_duplication`
Use C hash to represent Record fields
Migrate `memo` to using a C hash
Uses C hashes for keyword parameters
Remove parser call to `todo!`
Remove calls to `rbs_struct_to_ruby_value`
TMP symbol
Replace 2 fake nodes by one
Set fields for `Record::FieldType`
Make comment use a `rbs_ast_comment_t` instead of a `VALUE`
Add `rbs_ast_string_t`
Add `rbs_ast_integer_t`
Migrate `literal` to store C nodes
Remove `cached_ruby_string`
Remove useless templating stuff
Remove `cached_ruby_value` from `rbs_node_list`
Remove `cached_ruby_value` from `rbs_hash`
Add `rbs_string`, and use it for annotations
Add `rbs_ast_symbol_t` to model symbols in the AST
Co-Authored-By: Alexander Momchilov <[email protected]>
And rename it to `class_constants` to disambiguate it from `rbs_constant_id`, `rbs_constant_pool`, etc.
Do not create comments using a VALUE
Use a rbs_string instead
`rbs_node_destroy`, `rbs_hash_free`, `rbs_node_list_free` are only calling each other recursively without any real freeing logic. This is the result of previous efforts to allocate all nodes on the arena. So we don't need these functions anymore. Discovered while working on #41
Co-authored-by: Alexander Momchilov <[email protected]>
Thank you all for this contribution! I really appreciate all of you. 👏
soutaro left a comment
I tried to add a new inline annotation at #2443, and confirmed that I feel I can work on this codebase. 👍
Opened ruby/ruby#13237 to test if the new C code works with Ruby CI compilers. (I should have done the test before merging... 💦)
RBS C Parser Library Refactoring
This PR refactors the RBS parser and related components into a standalone C library that no longer depends on the Ruby runtime. This architectural change enables direct integration with static analysis tools like Sorbet while potentially improving performance.
Sorbet's RBS support already runs on this new architecture and we haven't discovered any major issues around it.
This work was a collaborative effort by
Key Improvements
The parser moves out of the `ext` folder into a standalone C library with a clean API, which can now be embedded in non-Ruby tools without a Ruby runtime dependency (e.g. Sorbet, JRuby).
Enhanced Memory Management
An arena allocator handles all memory for parser objects, including the parser itself, the lexer, the constant pool, and strings. When the parser is freed by calling `rbs_parser_free`, the allocator frees every object it allocated. This eliminates the need to free individual objects manually and reduces the risk of memory leaks.
Component Architecture
graph TD
    RubyClient[Ruby Client] --> RubyAPI[Ruby API]
    CClient[C Client] --> CAPI[C API]
    RubyAPI --> CExtension[C Extension]
    CExtension --> CLibrary
    CAPI --> CLibrary
    subgraph CLibrary[C Library]
        subgraph Parser1[Parser Instance 1]
            direction TB
            ConstantPool1[Constant Pool]
            Lexer1[Lexer]
            ArenaAllocator1[Arena Allocator]
        end
        subgraph Parser2[Parser Instance 2]
            direction TB
            ConstantPool2[Constant Pool]
            Lexer2[Lexer]
            ArenaAllocator2[Arena Allocator]
        end
    end
    subgraph "Public API"
        RubyAPI
        CAPI
    end
    %% Parser1 --> ConstantPool1
    %% Parser1 --> Lexer1
    %% Parser1 --> ArenaAllocator1
    %% Parser2 --> ConstantPool2
    %% Parser2 --> Lexer2
    %% Parser2 --> ArenaAllocator2
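To make the memory model concrete, here is a minimal sketch of the arena pattern this description refers to, where each parser instance owns its own arena and a single free call releases everything. All names below (`toy_arena_t`, `toy_parser_t`, and their functions) are hypothetical and chosen for illustration; they are not the RBS library's API, and the real allocator may differ in page sizing, alignment, and error handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout: this is NOT the RBS library's API. */

typedef struct toy_page {
    struct toy_page *next;  /* previously filled page, or NULL */
    size_t used;            /* bytes already handed out from data[] */
    size_t capacity;        /* usable bytes in data[] */
    max_align_t data[];     /* flexible array member, maximally aligned */
} toy_page_t;

typedef struct {
    toy_page_t *head;       /* most recently created page */
    size_t page_size;       /* default capacity for new pages */
} toy_arena_t;

/* Round n up to a multiple of the strictest fundamental alignment. */
static size_t toy_align_up(size_t n) {
    size_t a = _Alignof(max_align_t);
    return (n + a - 1) / a * a;
}

toy_arena_t *toy_arena_new(size_t page_size) {
    toy_arena_t *arena = malloc(sizeof(*arena));
    if (arena == NULL) return NULL;
    arena->head = NULL;
    arena->page_size = toy_align_up(page_size);
    return arena;
}

/* Bump-allocate from the current page; start a new page when it is full. */
void *toy_arena_alloc(toy_arena_t *arena, size_t size) {
    assert(arena != NULL);
    size = toy_align_up(size);
    toy_page_t *page = arena->head;
    if (page == NULL || page->capacity - page->used < size) {
        size_t capacity = size > arena->page_size ? size : arena->page_size;
        page = malloc(sizeof(toy_page_t) + capacity);
        if (page == NULL) return NULL;
        page->next = arena->head;
        page->used = 0;
        page->capacity = capacity;
        arena->head = page;
    }
    void *ptr = (unsigned char *)page->data + page->used;
    page->used += size;
    return ptr;
}

/* Release every page at once; no per-object free is needed. */
void toy_arena_free(toy_arena_t *arena) {
    if (arena == NULL) return;
    toy_page_t *page = arena->head;
    while (page != NULL) {
        toy_page_t *next = page->next;
        free(page);
        page = next;
    }
    free(arena);
}

/* A toy parser whose own struct and source copy both live in its arena. */
typedef struct {
    toy_arena_t *arena;
    const char *source;
} toy_parser_t;

toy_parser_t *toy_parser_new(const char *source) {
    toy_arena_t *arena = toy_arena_new(4096);
    if (arena == NULL) return NULL;
    size_t len = strlen(source) + 1;
    toy_parser_t *parser = toy_arena_alloc(arena, sizeof(*parser));
    char *copy = toy_arena_alloc(arena, len);
    if (parser == NULL || copy == NULL) {
        toy_arena_free(arena);
        return NULL;
    }
    memcpy(copy, source, len);
    parser->arena = arena;
    parser->source = copy;
    return parser;
}

/* Freeing the parser frees its arena, and with it the parser struct itself. */
void toy_parser_free(toy_parser_t *parser) {
    if (parser != NULL) toy_arena_free(parser->arena);
}
```

Because each toy parser carries its own arena, two instances share no allocator state, which mirrors the per-instance boxes in the diagram. The trade-off is that individual objects cannot be released early; memory is reclaimed only when the whole parser goes away.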