feat(logs): add Rust FFI tokenizer bridge with FlatBuffers #49216
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9957229d8f
    "otlp",
    "podman",
    "python",
    "rust_patterns",
Exclude rust_patterns from default Windows agent tags
Adding rust_patterns to the default AGENT_TAGS enables the Rust tokenizer bridge on Windows builds, but this commit does not vendor a Windows libpatterns artifact (the new vendor/README.md marks windows_amd64 as "Not yet built") while tokenizer.go still links -lpatterns from vendor/windows_amd64. In practice, default Windows agent builds can fail at link time with cannot find -lpatterns unless developers manually prepare platform-specific artifacts; this tag should be gated off on Windows (or enabled only when a Windows binary is present).
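One way to read the suggestion above is as a small filter applied to the tag list before the build starts. The sketch below is hypothetical and is not the code in tasks/build_tags.py: the function names, the `vendor_root` parameter, and the `libpatterns.*` glob are assumptions made for illustration. Only the `rust_patterns` tag name and the `vendor/<os>_<arch>` layout come from the review comment.

```python
from pathlib import Path

# Tag name taken from the review comment; everything else here is illustrative.
RUST_PATTERNS_TAG = "rust_patterns"


def has_vendored_libpatterns(vendor_root, goos, goarch):
    """Return True if <vendor_root>/<goos>_<goarch>/ holds any libpatterns.* file.

    The glob pattern is an assumption; a real check might look for a
    specific file name per platform instead.
    """
    platform_dir = Path(vendor_root) / f"{goos}_{goarch}"
    return platform_dir.is_dir() and any(platform_dir.glob("libpatterns.*"))


def gate_rust_patterns(tags, vendor_root, goos, goarch):
    """Drop rust_patterns from tags when no library is vendored for the target."""
    if has_vendored_libpatterns(vendor_root, goos, goarch):
        return list(tags)
    return [t for t in tags if t != RUST_PATTERNS_TAG]
```

Keying the gate on the presence of an artifact rather than on the OS name alone would let the tag turn itself back on once a Windows binary is vendored, which matches the second option the comment mentions.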

What does this PR do?
Adds the Rust FFI tokenizer bridge that connects the Go agent to the patterns Rust library via cgo + FlatBuffers:
- pkg/logs/patterns/tokenizer/rust/: cgo bindings to libpatterns, FlatBuffers schema for zero-copy token serialization, TokenConversion to map Rust tokens → Go token.Token types
- libpatterns.{dylib,so} for darwin/linux × amd64/arm64, plus the C header
- tasks/patterns.py: Invoke tasks for compiling the Rust library, running benchmarks, and managing vendor artifacts
- tasks/build_tags.py, tasks/agent.py: patterns build tag gating

Motivation
The Rust patterns library provides high-performance log tokenization (signature extraction, pattern detection). This bridge lets the Go agent call into it without re-implementing the tokenizer in Go, using FlatBuffers to minimize serialization overhead.

Describe how you validated your changes
- rust_*_test.go
- tasks/patterns.py successfully builds and vendors the library on macOS arm64

How to Review this PR
- tokenizer.go: the main RustTokenizer struct and its Tokenize() method
- token_conversion.go for the FlatBuffers → Go token mapping
- flatbuffers/patterns_tokenizer.fbs for the serialization schema
- tasks/patterns.py for the build/vendor workflow
- The vendor/ directory contains pre-built binaries; verify the README for provenance

Additional Notes
The tokenizer_stub.go provides a no-op implementation when the patterns build tag is disabled, so this doesn't affect builds that don't opt in.
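
For reviewers checking the vendor/ directory, the platform matrix named in the description (darwin/linux × amd64/arm64) can be enumerated with a short script. This is a minimal sketch under stated assumptions, not part of tasks/patterns.py: the helper names and the `vendor_root` argument are hypothetical, and the `<os>_<arch>` directory naming is inferred from the `vendor/windows_amd64` path mentioned in the review comment.

```python
from pathlib import Path

# Library suffix per OS, following "libpatterns.{dylib,so}" in the description.
LIB_SUFFIX = {"darwin": "dylib", "linux": "so"}
ARCHES = ("amd64", "arm64")


def expected_artifacts(vendor_root):
    """List the libpatterns paths expected for darwin/linux x amd64/arm64.

    Assumes a <vendor_root>/<os>_<arch>/libpatterns.<suffix> layout.
    """
    root = Path(vendor_root)
    return [
        root / f"{goos}_{arch}" / f"libpatterns.{suffix}"
        for goos, suffix in LIB_SUFFIX.items()
        for arch in ARCHES
    ]


def missing_artifacts(vendor_root):
    """Return the expected library paths that are not present on disk."""
    return [p for p in expected_artifacts(vendor_root) if not p.is_file()]
```

Calling `missing_artifacts` with the actual vendor directory path (not stated in the description) would return an empty list if every library in the matrix is present.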