feat: complete c parser #13
base: main
license-eye has totally checked 53 files.
Valid | Invalid | Ignored | Fixed |
---|---|---|---|
51 | 2 | 0 | 0 |
Invalid file list:
- src/lang/cxx/lib.go
- src/lang/cxx/spec.go
src/lang/cxx/lib.go (outdated)
@@ -0,0 +1,27 @@
package cxx
Suggested change:
-package cxx
+// Copyright 2025 CloudWeGo Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+package cxx
src/lang/cxx/spec.go (outdated)
@@ -0,0 +1,190 @@
package cxx
Suggested change:
-package cxx
+// Copyright 2025 CloudWeGo Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+package cxx
Resolve the conflicts first, @Hoblovski.
1. Since clangd does not support the semanticTokens/range method, emulate it with semanticTokens/full plus filtering.
2. Since the concept of package and module does not apply to C/C++, treat the whole repo as a single package/module.
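A minimal sketch of the first point, assuming the standard LSP encoding in which each token is five integers (deltaLine, deltaStartChar, length, tokenType, tokenModifiers) and positions are zero-based. The types and the name `filterTokensInRange` are hypothetical stand-ins for illustration, not the code in this PR: the full token list is decoded to absolute positions, tokens starting outside the range are dropped, and the survivors are re-encoded relative to each other.

```go
package main

import "fmt"

// Position, Range and SemanticTokens are simplified, hypothetical stand-ins
// for the LSP types (zero-based line and character).
type Position struct{ Line, Character uint32 }
type Range struct{ Start, End Position }
type SemanticTokens struct{ Data []uint32 }

// before reports whether a comes strictly before b in document order.
func before(a, b Position) bool {
	return a.Line < b.Line || (a.Line == b.Line && a.Character < b.Character)
}

// filterTokensInRange keeps the tokens whose start lies in [r.Start, r.End)
// and re-encodes their deltas relative to the previously kept token.
func filterTokensInRange(resp *SemanticTokens, r Range) {
	var out []uint32
	var cur, prev Position // absolute start of the current / last kept token
	for i := 0; i+4 < len(resp.Data); i += 5 {
		dLine, dChar := resp.Data[i], resp.Data[i+1]
		if dLine == 0 {
			cur.Character += dChar // same line: delta from the previous token
		} else {
			cur.Line += dLine
			cur.Character = dChar // new line: character is absolute
		}
		if before(cur, r.Start) || !before(cur, r.End) {
			continue
		}
		outLine, outChar := cur.Line-prev.Line, cur.Character
		if outLine == 0 {
			outChar = cur.Character - prev.Character
		}
		out = append(out, outLine, outChar, resp.Data[i+2], resp.Data[i+3], resp.Data[i+4])
		prev = cur
	}
	resp.Data = out
}

func main() {
	// Tokens start at 0:4, 1:2, 1:10 and 3:0.
	resp := &SemanticTokens{Data: []uint32{0, 4, 3, 0, 0, 1, 2, 5, 1, 0, 0, 8, 2, 2, 0, 2, 0, 4, 3, 0}}
	filterTokensInRange(resp, Range{Start: Position{1, 0}, End: Position{2, 0}})
	fmt.Println(resp.Data) // prints [1 2 5 1 0 0 8 2 2 0]
}
```

Re-encoding matters here: once earlier tokens are dropped, the first surviving token's delta has to be rewritten relative to 0:0, otherwise a consumer would decode the wrong positions.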
5011265 to 83847e7
lang/parse.go (outdated)
@@ -94,6 +95,9 @@ func checkRepoPath(repoPath string, language uniast.Language) (openfile string,
case uniast.Rust:
// NOTICE: open the Cargo.toml file is required for Rust projects
openfile, wait = rust.CheckRepo(repoPath)
case uniast.Cxx:
// NOTICE: open the Cargo.toml file is required for Rust projects
Please update the comment.
ok
lang/lsp/lsp.go (outdated)
}

func filterSemanticTokensInRange(resp *SemanticTokens, r Range) {
// LSP starts from 0:0 but the project seems to use 1:1 (see collect PositionOffset)
Shouldn't this all be unified to 0-based now?
Sorry... I have removed the comment.
}

// returns: mod, path, error
func (c *CxxSpec) NameSpace(path string) (string, string, error) {
C also supports using multiple directories as separate namespaces. This implementation does not seem able to support that?
Specifically: symbols with the same name that live in different folders (with no mutual includes). How do we tell them apart clearly in the AST?
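In C, directories do not affect compilation; they are only storage locations. For example, main.c:foobar and driver/name/lib.c:foobar are both just foobar as far as C is concerned; there is no main.foobar or driver.name.lib.foobar. Directories could still serve as heuristic information for the LLM, though. Mark it as a TODO for now?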
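> Specifically: symbols with the same name that live in different folders (with no mutual includes). How do we tell them apart clearly in the AST?

In C this fails to build: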
$ for i in 1 2; do mkdir d$i && echo "int add(int a){return a+1;}" > d$i/add.c ; done
$ echo "extern int add(int); int main(int argc,char**argv){return add(argc);}" > main.c
$ gcc **/*.c
/usr/bin/ld: /tmp/ccilArbh.o: in function `add':
add.c:(.text+0x0): multiple definition of `add'; /tmp/ccorwzKC.o:add.c:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
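In practice it can still happen, though, for example when different platforms need to be supported... For now it could be worked around through build_commands.json (that is, by ignoring some .c files?)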
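Different modules may live in the same repository. We need to think through how to handle that case: either the actual compilation module has to be specified at parse time, or all compilation modules have to be enumerated.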
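OK, I think specifying the actual compilation module at parse time is the better option. The Linux kernel has many such cases (for example, the x86 and arm directories contain files with the same name, but they are never compiled into the same kernel image). Working on the Linux kernel in VS Code also relies on compile_commands.json to specify which modules are actually present [1] [2] [3]. Once I finish running the tests for my paper I can give a more complete description.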
[1] https://zhuanlan.zhihu.com/p/558286384
[2] https://gist.github.com/itewqq/4b4ee89ba420d585efb472116879b1ee
[3] https://github.com/amezin/vscode-linux-kernel
C allows symbols with the same name in a single module, provided either:
* One is a weak symbol (decl) and one is a strong symbol (def)
* They are both strong symbols, but never linked together.

The first one works fine, but more changes are needed for the second one.

testdata/cxxsimple illustrates the first scenario. Two instances of `myself` are present, one (weak) in `pair.h` and one (strong) in `pair.c`. The dependency is well defined in this scenario:
1. `pair.c:myself` depends on `pair.h:myself`
2. any other function using `myself` depends on both.

To verify, run `./abcoder parse cxx testdata/cxxsimple > cxxsimple.json`.

testdata/cxxduplicate is the second scenario. Two strong instances of `add` are present, each used in a different executable. clangd handles this with compile_commands.json. If clangd is invoked as below, the `main->add` dependency shall point to the `add` in `d1/add.c`.

    mkdir build && cd build && cmake ..
    bear -- make prog1 # generate compile_commands.json
    cd testdata/cduplicate && clangd-18

While clangd does the right job, the current implementation of scanning during collection does not take into account which files are included in a compilation (as specified in compile_commands.json). So `Collector.Collect` will incorrectly include `d2/add.c` even if it is not used, and corrupt the dependencies. That is to say, even for the compilation `prog1 <- main.c, d1/add.c`, a dependency `main->d2/add.c:add` will be present.
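One possible way to close that gap, sketched under assumptions: read compile_commands.json (a JSON array whose entries carry `directory` and `file` fields) and restrict the scan to the translation units it names. The helper names `compiledFiles` and `keepCompiled` and the `/repo/...` paths are hypothetical, and nothing here reflects how `Collector.Collect` is actually structured.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path/filepath"
)

// compileCommand holds the two fields of a compilation database entry
// that this sketch needs; other fields are ignored by encoding/json.
type compileCommand struct {
	Directory string `json:"directory"`
	File      string `json:"file"`
}

// compiledFiles returns the set of cleaned source paths named in the database.
func compiledFiles(db []byte) (map[string]bool, error) {
	var cmds []compileCommand
	if err := json.Unmarshal(db, &cmds); err != nil {
		return nil, err
	}
	files := make(map[string]bool, len(cmds))
	for _, c := range cmds {
		p := c.File
		if !filepath.IsAbs(p) {
			p = filepath.Join(c.Directory, p) // relative paths resolve against "directory"
		}
		files[filepath.Clean(p)] = true
	}
	return files, nil
}

// keepCompiled filters candidate paths down to those present in the set.
func keepCompiled(candidates []string, compiled map[string]bool) []string {
	var kept []string
	for _, p := range candidates {
		if compiled[filepath.Clean(p)] {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	// Hypothetical database for prog1 <- main.c, d1/add.c (Unix-style paths).
	db := []byte(`[
	  {"directory": "/repo/build", "file": "/repo/main.c", "command": "cc -c /repo/main.c"},
	  {"directory": "/repo/build", "file": "../d1/add.c", "command": "cc -c ../d1/add.c"}
	]`)
	compiled, err := compiledFiles(db)
	if err != nil {
		panic(err)
	}
	all := []string{"/repo/main.c", "/repo/d1/add.c", "/repo/d2/add.c"}
	fmt.Println(keepCompiled(all, compiled)) // d2/add.c is filtered out
}
```

Headers are normally not listed in compile_commands.json, so a filter like this would apply only to `.c` translation units; headers such as `pair.h` would still need to be reached through the includes of the files that are kept.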
What type of PR is this?
Implement the C parser to AST.
Since the C language has no modules, there is only a single module.
Check the PR title.
(Optional) Translate the PR title into Chinese.
(Optional) More detailed description for this PR(en: English/zh: Chinese).
en:
zh(optional):
(Optional) Which issue(s) this PR fixes:
(optional) The PR that updates user documentation: