Rust Core
Welcome to the deep dive documentation for the In Memoria Rust Core. This document provides a comprehensive look into the architecture, modules, and inner workings of the high-performance analysis engine that powers In Memoria.
- Source Location: `rust-core/`
- Entry Point: `rust-core/src/lib.rs`
The Rust core is the powerhouse of In Memoria. It is written in Rust for three primary reasons:
- Performance: Code analysis, especially parsing large codebases and traversing ASTs, is computationally expensive. Rust provides the near-native performance required to do this quickly.
- Memory Safety: Rust's ownership model prevents common memory-related bugs, which is crucial for a stable and reliable analysis engine.
- Concurrency: Rust's fearless concurrency allows for future parallelization of the analysis process, enabling faster performance on multi-core systems.
The core's primary responsibility is to take raw source code and transform it into a structured, intelligent understanding of the codebase, which is then stored and served by the TypeScript application.
Several key structs are used throughout the Rust core to represent the code and the intelligence derived from it. These are defined in rust-core/src/types/ and rust-core/src/patterns/types.rs.
The `SemanticConcept` struct is the fundamental unit of understanding in In Memoria. It represents a single, meaningful piece of code.
- Source: `rust-core/src/types/core_types.rs`
```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
#[cfg_attr(feature = "napi-bindings", napi(object))]
pub struct SemanticConcept {
    pub id: String,
    pub name: String,
    pub concept_type: String, // e.g., 'class', 'function', 'interface'
    pub confidence: f64,
    pub file_path: String,
    pub line_range: LineRange,
    pub relationships: HashMap<String, String>, // e.g., {"calls": "other_function_id"}
    pub metadata: HashMap<String, String>, // e.g., {"async": "true"}
}
```

The `Pattern` struct represents a recurring pattern discovered in the codebase.
- Source: `rust-core/src/patterns/types.rs`
```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
#[cfg_attr(feature = "napi-bindings", napi(object))]
pub struct Pattern {
    pub id: String,
    pub pattern_type: String, // e.g., 'naming', 'structural', 'implementation'
    pub description: String,
    pub frequency: u32,
    pub confidence: f64,
    pub examples: Vec<PatternExample>,
    pub contexts: Vec<String>, // e.g., 'typescript', 'function'
}
```

The `CodebaseAnalysisResult` struct holds the complete result of a high-level codebase analysis.
- Source: `rust-core/src/types/core_types.rs`
```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
#[cfg_attr(feature = "napi-bindings", napi(object))]
pub struct CodebaseAnalysisResult {
    pub languages: Vec<String>,
    pub frameworks: Vec<String>,
    pub complexity: ComplexityMetrics,
    pub concepts: Vec<SemanticConcept>,
}
```

The parsing module (`rust-core/src/parsing/`) is the first step in the analysis pipeline, responsible for converting raw source code text into a structured AST.
The ParserManager is the central hub for all parsing activities. It holds a collection of tree-sitter parsers, one for each supported language.
- Initialization: The `new()` constructor calls `initialize_parsers()`, which sets up parsers for TypeScript, JavaScript, Rust, Python, SQL, Go, Java, C, C++, C#, and Svelte.

```rust
// From rust-core/src/parsing/manager.rs
fn initialize_parsers(&mut self) -> Result<(), ParseError> {
    // TypeScript parser
    let mut ts_parser = Parser::new();
    ts_parser.set_language(&tree_sitter_typescript.into()).map_err(...)?;
    self.parsers.insert("typescript".to_string(), ts_parser);
    // ... and so on for other languages
    Ok(())
}
```
- Parsing: The `parse` method takes source code and a language identifier, and returns a tree-sitter `Tree`.

```rust
// From rust-core/src/parsing/manager.rs
pub fn parse(&mut self, code: &str, language: &str) -> Result<Tree, ParseError> {
    let parser = self.parsers.get_mut(language).ok_or_else(...)?;
    parser.parse(code, None).ok_or_else(...)
}
```
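As an aside, the language-keyed registry pattern behind `ParserManager` can be sketched without tree-sitter. Everything below (`LangParser`, `EchoParser`, `ParserRegistry`) is a hypothetical stand-in for illustration, not the repo's API:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a tree-sitter parser (an assumption, not the real API).
trait LangParser {
    fn parse(&self, code: &str) -> Option<String>; // returns a fake "tree" string
}

struct EchoParser {
    language: &'static str,
}

impl LangParser for EchoParser {
    fn parse(&self, code: &str) -> Option<String> {
        Some(format!("{}-tree({} bytes)", self.language, code.len()))
    }
}

// Minimal registry mirroring ParserManager's language-keyed lookup.
struct ParserRegistry {
    parsers: HashMap<String, Box<dyn LangParser>>,
}

impl ParserRegistry {
    fn new() -> Self {
        let mut parsers: HashMap<String, Box<dyn LangParser>> = HashMap::new();
        parsers.insert("rust".to_string(), Box::new(EchoParser { language: "rust" }));
        parsers.insert("typescript".to_string(), Box::new(EchoParser { language: "typescript" }));
        ParserRegistry { parsers }
    }

    // Like ParserManager::parse: fail with an error for unknown languages.
    fn parse(&self, code: &str, language: &str) -> Result<String, String> {
        let parser = self
            .parsers
            .get(language)
            .ok_or_else(|| format!("unsupported language: {}", language))?;
        parser.parse(code).ok_or_else(|| "parse failed".to_string())
    }
}

fn main() {
    let registry = ParserRegistry::new();
    assert!(registry.parse("fn main() {}", "rust").is_ok());
    assert!(registry.parse("let x = 1;", "cobol").is_err());
}
```

The real manager stores concrete tree-sitter `Parser` values; a trait object is used here only to keep the sketch self-contained.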
The tree walker (`rust-core/src/parsing/tree_walker.rs`) is a generic utility that performs a depth-first traversal of a tree-sitter AST. It allows other modules, like the extractors, to apply logic to every node in the tree without duplicating traversal code.
```rust
// From rust-core/src/parsing/tree_walker.rs
pub fn walk<F>(&self, node: Node<'_>, visitor: &mut F) -> Result<(), String>
where
    F: FnMut(Node<'_>) -> Result<(), String>,
{
    self.walk_recursive(node, visitor, 0)
}
```

As a safety net, if tree-sitter fails to parse a file (due to syntax errors or unsupported language features), the `FallbackExtractor` uses a set of regular expressions to perform a basic, line-by-line analysis to extract at least some concepts.
```rust
// From rust-core/src/parsing/fallback.rs
fn extract_function_name(&self, line: &str) -> Option<String> {
    // TypeScript/JavaScript function patterns
    if line.contains("function ") { ... }
    // Arrow function patterns: const funcName = () =>
    if line.contains("=>") { ... }
    // Rust function patterns
    if line.contains("fn ") { ... }
    // Python function patterns
    if line.trim_start().starts_with("def ") { ... }
    None
}
```

Extractors are responsible for traversing the AST produced by the parser and converting language-specific nodes into the generic `SemanticConcept` struct. Each supported language has its own extractor module.
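Stepping back to the fallback path for a moment: a stdlib-only sketch of such line-based extraction might look like the following. This is hypothetical and uses plain string matching in place of the repo's regular expressions:

```rust
// Hypothetical, stdlib-only sketch of line-based fallback extraction.
// The real FallbackExtractor uses regexes; simple prefix/substring checks
// approximate the same heuristics here.
fn extract_function_name(line: &str) -> Option<String> {
    let trimmed = line.trim_start();
    // Rust: `fn name(`
    if let Some(rest) = trimmed.strip_prefix("fn ") {
        return rest.split('(').next().map(|s| s.trim().to_string());
    }
    // Python: `def name(`
    if let Some(rest) = trimmed.strip_prefix("def ") {
        return rest.split('(').next().map(|s| s.trim().to_string());
    }
    // TypeScript/JavaScript: `function name(`
    if let Some(idx) = trimmed.find("function ") {
        let rest = &trimmed[idx + "function ".len()..];
        return rest.split('(').next().map(|s| s.trim().to_string());
    }
    None
}

fn main() {
    assert_eq!(extract_function_name("fn parse(code: &str)"), Some("parse".to_string()));
    assert_eq!(extract_function_name("  def handler(req):"), Some("handler".to_string()));
    assert_eq!(extract_function_name("let x = 1;"), None);
}
```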
The TypeScriptExtractor looks for nodes specific to TypeScript/JavaScript syntax.
```rust
// From rust-core/src/extractors/typescript.rs
pub fn extract_concepts(...) -> Result<(), ParseError> {
    match node.kind() {
        "class_declaration" | "interface_declaration" | "type_alias_declaration" => {
            // ... extract class/interface/type concept
        }
        "function_declaration" | "method_definition" | "arrow_function" => {
            // ... extract function concept
        }
        "variable_declaration" | "lexical_declaration" => {
            // ... extract variable concept
        }
        // ... and so on
    }
    Ok(())
}
```

The analysis module (`rust-core/src/analysis/`) performs the higher-level analysis on the collection of `SemanticConcept`s extracted from the ASTs.
The `SemanticAnalyzer` is the main orchestrator for analysis. Its `analyze_codebase` method is a key entry point that:

- Detects languages and frameworks.
- Calls the appropriate extractors to get all semantic concepts.
- Invokes the `ComplexityAnalyzer` to calculate metrics.
- Returns a `CodebaseAnalysisResult`.
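The steps above can be sketched as a plain function over minimal stand-in types. The names, fields, and language-detection-by-extension heuristic below are illustrative assumptions (serde/napi derives omitted), not the repo's implementation:

```rust
// Minimal stand-ins for the structs shown earlier.
#[derive(Debug)]
struct Concept {
    name: String,
    concept_type: String,
    file_path: String,
}

#[derive(Debug)]
struct Metrics {
    concept_count: usize,
    files_analyzed: usize,
}

#[derive(Debug)]
struct AnalysisResult {
    languages: Vec<String>,
    concepts: Vec<Concept>,
    complexity: Metrics,
}

// Sketch of analyze_codebase's flow: detect languages, gather concepts,
// compute metrics, bundle everything into one result.
fn analyze(concepts: Vec<Concept>) -> AnalysisResult {
    // Crude language detection: unique file extensions.
    let mut languages: Vec<String> = concepts
        .iter()
        .filter_map(|c| c.file_path.rsplit('.').next().map(|ext| ext.to_string()))
        .collect();
    languages.sort();
    languages.dedup();

    // Count distinct files that contributed concepts.
    let mut files: Vec<&str> = concepts.iter().map(|c| c.file_path.as_str()).collect();
    files.sort();
    files.dedup();

    let complexity = Metrics {
        concept_count: concepts.len(),
        files_analyzed: files.len(),
    };
    AnalysisResult { languages, concepts, complexity }
}

fn main() {
    let result = analyze(vec![
        Concept { name: "main".into(), concept_type: "function".into(), file_path: "src/main.rs".into() },
        Concept { name: "App".into(), concept_type: "class".into(), file_path: "src/app.ts".into() },
    ]);
    assert_eq!(result.languages, vec!["rs".to_string(), "ts".to_string()]);
    assert_eq!(result.complexity.files_analyzed, 2);
}
```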
This analyzer is responsible for creating a quick, high-level overview of a project.

- `detect_entry_points`: Scans for common entry point files like `index.ts`, `main.py`, `App.jsx`, etc., to understand how the project starts.
- `map_key_directories`: Looks for conventional directory names like `src/components`, `src/services`, `src/api`, and `src/auth` to understand the project's structure.
- `build_feature_map`: Maps directories to high-level features (e.g., the `auth` directory is mapped to the "authentication" feature).
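The entry-point scan can be sketched as a simple filename filter. The candidate list and function shape below are assumptions for illustration:

```rust
// Hypothetical sketch of detect_entry_points: match file names against a
// list of conventional entry points (this list is an assumed subset).
const ENTRY_POINTS: &[&str] = &["index.ts", "index.js", "main.py", "main.rs", "App.jsx"];

fn detect_entry_points<'a>(files: &[&'a str]) -> Vec<&'a str> {
    files
        .iter()
        .copied()
        .filter(|path| {
            // Compare only the final path component against the known names.
            path.rsplit('/')
                .next()
                .map(|name| ENTRY_POINTS.contains(&name))
                .unwrap_or(false)
        })
        .collect()
}

fn main() {
    let files = ["src/index.ts", "src/util/strings.ts", "scripts/main.py"];
    let entries = detect_entry_points(&files);
    assert_eq!(entries, vec!["src/index.ts", "scripts/main.py"]);
}
```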
This component detects the frameworks and libraries used in a project by scanning dependency files.
```rust
// From rust-core/src/analysis/frameworks.rs
fn check_package_files(...) -> Result<(), ParseError> {
    let package_files = [
        "package.json",
        "Cargo.toml",
        "requirements.txt",
        "pom.xml",
        "go.mod",
        // ...
    ];
    // ... walk directories and analyze these files
}
```

The patterns module (`rust-core/src/patterns/`) is the most advanced module in the Rust core, responsible for machine-learning-style pattern discovery.
This is the central engine that drives all pattern learning. Its `learn_from_codebase` method is a multi-phase process:

1. Collects all `SemanticConcept`s.
2. Invokes the `NamingPatternAnalyzer`.
3. Invokes the `StructuralPatternAnalyzer`.
4. Invokes the `ImplementationPatternAnalyzer`.
5. Consolidates and validates the discovered patterns.
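The phases above can be sketched as a pipeline of analyzers feeding a consolidation step. The `PatternAnalyzer` trait, the camelCase heuristic, and the 0.5 confidence threshold are all illustrative assumptions, not the repo's code:

```rust
// Sketch of the multi-phase learning pipeline: each analyzer contributes
// patterns, then a consolidation step filters low-confidence results.
struct LearnedPattern {
    description: String,
    confidence: f64,
}

trait PatternAnalyzer {
    fn analyze(&self, concept_names: &[String]) -> Vec<LearnedPattern>;
}

struct NamingAnalyzer;

impl PatternAnalyzer for NamingAnalyzer {
    fn analyze(&self, concept_names: &[String]) -> Vec<LearnedPattern> {
        // Toy heuristic: names without underscores are treated as camelCase.
        let camel = concept_names.iter().filter(|n| !n.contains('_')).count();
        vec![LearnedPattern {
            description: "camelCase names".to_string(),
            confidence: camel as f64 / concept_names.len().max(1) as f64,
        }]
    }
}

fn learn(concept_names: &[String], analyzers: &[&dyn PatternAnalyzer]) -> Vec<LearnedPattern> {
    let mut patterns: Vec<LearnedPattern> = analyzers
        .iter()
        .flat_map(|a| a.analyze(concept_names))
        .collect();
    // Consolidation: keep only patterns the evidence actually supports.
    patterns.retain(|p| p.confidence >= 0.5);
    patterns
}

fn main() {
    let names = vec!["getUser".to_string(), "setName".to_string(), "do_thing".to_string()];
    let analyzers: [&dyn PatternAnalyzer; 1] = [&NamingAnalyzer];
    let patterns = learn(&names, &analyzers);
    assert_eq!(patterns.len(), 1);
    assert_eq!(patterns[0].description, "camelCase names");
}
```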
The `NamingPatternAnalyzer` learns the dominant naming conventions in a codebase.

- It defines a set of `NamingRules` with regex patterns for different conventions (e.g., `^[a-z][a-zA-Z0-9]*$` for `camelCase`).
- It analyzes the names of functions, classes, and variables against these rules to determine the project's preferred style (e.g., "functions are `camelCase`", "classes are `PascalCase`").
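A minimal sketch of such rule checks, using stdlib character tests in place of the analyzer's regexes (the function names and tie-breaking are assumptions):

```rust
// Hypothetical stdlib-only equivalents of regex rules like
// ^[a-z][a-zA-Z0-9]*$ (camelCase) and ^[A-Z][a-zA-Z0-9]*$ (PascalCase).
fn is_camel_case(name: &str) -> bool {
    match name.chars().next() {
        Some(c) if c.is_ascii_lowercase() => {}
        _ => return false,
    }
    name.chars().all(|c| c.is_ascii_alphanumeric())
}

fn is_pascal_case(name: &str) -> bool {
    match name.chars().next() {
        Some(c) if c.is_ascii_uppercase() => {}
        _ => return false,
    }
    name.chars().all(|c| c.is_ascii_alphanumeric())
}

// Pick the dominant style among a set of names (camelCase wins ties).
fn dominant_style(names: &[&str]) -> &'static str {
    let camel = names.iter().filter(|n| is_camel_case(n)).count();
    let pascal = names.iter().filter(|n| is_pascal_case(n)).count();
    if camel >= pascal { "camelCase" } else { "PascalCase" }
}

fn main() {
    assert!(is_camel_case("getUser"));
    assert!(is_pascal_case("UserService"));
    assert!(!is_camel_case("do_thing")); // underscore fails the alphanumeric check
    assert_eq!(dominant_style(&["getUser", "setName", "UserService"]), "camelCase");
}
```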
The `StructuralPatternAnalyzer` analyzes the high-level structure and architecture of the codebase.

- It defines `ArchitectureSignatures` for common patterns like MVC, Clean Architecture, and Layered Architecture.
- It checks for these signatures by analyzing directory structures (e.g., the presence of `models/`, `views/`, `controllers/`) and file naming patterns.
- It also detects structural anti-patterns like God Objects (classes with too many methods) and circular dependencies.
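The God Object check, for instance, can be sketched as a threshold over per-class method counts. The threshold of 20 is an assumed value, not taken from the repo:

```rust
use std::collections::HashMap;

// Hypothetical sketch of God Object detection: flag classes whose method
// count exceeds a threshold.
fn find_god_objects(methods_per_class: &HashMap<String, usize>, threshold: usize) -> Vec<String> {
    let mut flagged: Vec<String> = methods_per_class
        .iter()
        .filter(|(_, &count)| count > threshold)
        .map(|(name, _)| name.clone())
        .collect();
    flagged.sort(); // deterministic output for a HashMap source
    flagged
}

fn main() {
    let mut counts = HashMap::new();
    counts.insert("UserService".to_string(), 8);
    counts.insert("AppManager".to_string(), 42);
    assert_eq!(find_god_objects(&counts, 20), vec!["AppManager".to_string()]);
}
```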
The `ImplementationPatternAnalyzer` detects common software design patterns.

- It defines `PatternSignatures` for patterns like Singleton, Factory, Observer, and Strategy.
- A signature consists of required methods (e.g., `getInstance` for Singleton), class characteristics, and common code patterns (checked via regex).
- It analyzes the `SemanticConcept`s to see which ones match these signatures.
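Signature matching of this kind can be sketched as a set-containment check. The struct below is a hypothetical stand-in for the repo's `PatternSignatures`, covering only the required-methods part of a signature:

```rust
use std::collections::HashSet;

// Hypothetical sketch: a class matches a pattern when it provides all of
// the signature's required methods (class characteristics and regex-based
// code checks are omitted).
struct PatternSignature {
    name: &'static str,
    required_methods: &'static [&'static str],
}

fn matches(signature: &PatternSignature, class_methods: &HashSet<&str>) -> bool {
    signature
        .required_methods
        .iter()
        .all(|m| class_methods.contains(m))
}

fn main() {
    let singleton = PatternSignature {
        name: "Singleton",
        required_methods: &["getInstance"],
    };
    let methods: HashSet<&str> = ["getInstance", "doWork"].into_iter().collect();
    assert!(matches(&singleton, &methods));
    let other: HashSet<&str> = ["doWork"].into_iter().collect();
    assert!(!matches(&singleton, &other));
    println!("{} signature checked", singleton.name);
}
```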
This component uses all the learned patterns to make intelligent predictions.

- It maintains a set of `ApproachTemplates` (e.g., "Microservices Architecture", "Modular Monolith").
- When given a problem description, it calculates a confidence score for each template based on the problem's complexity and how well the template's required patterns match the patterns already learned from the codebase.
- This allows it to suggest an approach that is consistent with the project's existing style.
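The pattern-overlap part of that confidence score might be sketched as follows. The template, pattern names, and scoring formula are illustrative assumptions, and the problem-complexity term is ignored:

```rust
// Hypothetical sketch: a template's confidence is the fraction of its
// required patterns already learned from the codebase.
struct ApproachTemplate {
    name: &'static str,
    required_patterns: &'static [&'static str],
}

fn score(template: &ApproachTemplate, learned: &[&str]) -> f64 {
    if template.required_patterns.is_empty() {
        return 0.0;
    }
    let hits = template
        .required_patterns
        .iter()
        .filter(|p| learned.contains(*p))
        .count();
    hits as f64 / template.required_patterns.len() as f64
}

fn main() {
    let monolith = ApproachTemplate {
        name: "Modular Monolith",
        required_patterns: &["layered", "module-boundaries"],
    };
    let learned = ["layered", "singleton"];
    let s = score(&monolith, &learned);
    assert!((s - 0.5).abs() < 1e-9); // one of two required patterns is present
    println!("{}: {:.2}", monolith.name, s);
}
```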
The entire Rust core is exposed to the TypeScript application via napi-rs. The #[napi] attribute is used on structs and impl blocks to automatically generate the necessary JavaScript bindings.
```rust
// From rust-core/src/analysis/semantic.rs
#[cfg_attr(feature = "napi-bindings", napi)]
impl SemanticAnalyzer {
    #[cfg_attr(feature = "napi-bindings", napi(constructor))]
    pub fn new() -> Result<Self, ParseError> { ... }

    #[cfg_attr(feature = "napi-bindings", napi)]
    pub async unsafe fn analyze_codebase(...) -> Result<CodebaseAnalysisResult, ParseError> { ... }
}
```

This allows the TypeScript code in `src/engines/semantic-engine.ts` to import and use the `SemanticAnalyzer` as if it were a native TypeScript class, while all the heavy computation happens in Rust.