
In Memoria - Rust Core Deep Dive

Welcome to the deep dive documentation for the In Memoria Rust Core. This document provides a comprehensive look into the architecture, modules, and inner workings of the high-performance analysis engine that powers In Memoria.

  • Source Location: rust-core/
  • Entry Point: rust-core/src/lib.rs

Table of Contents

  1. Core Mission
  2. Key Data Structures
  3. Module Deep Dive
  4. NAPI Bindings

1. Core Mission

The Rust core is the powerhouse of In Memoria. It is written in Rust for three primary reasons:

  • Performance: Code analysis, especially parsing large codebases and traversing ASTs, is computationally expensive. Rust compiles to fast native code, which keeps this work quick even on large projects.
  • Memory Safety: Rust's ownership model prevents common memory-related bugs, which is crucial for a stable and reliable analysis engine.
  • Concurrency: Rust's fearless concurrency allows for future parallelization of the analysis process, enabling faster performance on multi-core systems.

The core's primary responsibility is to take raw source code and transform it into a structured, intelligent understanding of the codebase, which is then stored and served by the TypeScript application.


2. Key Data Structures

Several key structs are used throughout the Rust core to represent the code and the intelligence derived from it. These are defined in rust-core/src/types/ and rust-core/src/patterns/types.rs.

SemanticConcept

This is the fundamental unit of understanding in In Memoria. It represents a single, meaningful piece of code.

  • Source: rust-core/src/types/core_types.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
#[cfg_attr(feature = "napi-bindings", napi(object))]
pub struct SemanticConcept {
    pub id: String,
    pub name: String,
    pub concept_type: String, // e.g., 'class', 'function', 'interface'
    pub confidence: f64,
    pub file_path: String,
    pub line_range: LineRange,
    pub relationships: HashMap<String, String>, // e.g., {"calls": "other_function_id"}
    pub metadata: HashMap<String, String>, // e.g., {"async": "true"}
}
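
LineRange is a small start/end pair defined alongside SemanticConcept. For orientation, here is a minimal sketch of that type and of a populated concept; the LineRange field names and all example values are illustrative assumptions, not copied from core_types.rs.

use std::collections::HashMap;

// Sketch only: see rust-core/src/types/core_types.rs for the real LineRange definition.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LineRange {
    pub start: u32, // assumed field name
    pub end: u32,   // assumed field name
}

// A hypothetical concept describing an async TypeScript function:
fn example_concept() -> SemanticConcept {
    SemanticConcept {
        id: "fn_fetchUser_src/services/user.ts_10".to_string(),
        name: "fetchUser".to_string(),
        concept_type: "function".to_string(),
        confidence: 0.92,
        file_path: "src/services/user.ts".to_string(),
        line_range: LineRange { start: 10, end: 24 },
        relationships: HashMap::from([("calls".to_string(), "fn_httpGet_7".to_string())]),
        metadata: HashMap::from([("async".to_string(), "true".to_string())]),
    }
}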

Pattern

Represents a recurring pattern discovered in the codebase.

  • Source: rust-core/src/patterns/types.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
#[cfg_attr(feature = "napi-bindings", napi(object))]
pub struct Pattern {
    pub id: String,
    pub pattern_type: String, // e.g., 'naming', 'structural', 'implementation'
    pub description: String,
    pub frequency: u32,
    pub confidence: f64,
    pub examples: Vec<PatternExample>,
    pub contexts: Vec<String>, // e.g., 'typescript', 'function'
}

CodebaseAnalysisResult

This struct holds the complete result of a high-level codebase analysis.

  • Source: rust-core/src/types/core_types.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
#[cfg_attr(feature = "napi-bindings", napi(object))]
pub struct CodebaseAnalysisResult {
    pub languages: Vec<String>,
    pub frameworks: Vec<String>,
    pub complexity: ComplexityMetrics,
    pub concepts: Vec<SemanticConcept>,
}

3. Module Deep Dive

Parsing (rust-core/src/parsing/)

This module is the first step in the analysis pipeline, responsible for converting raw source code text into a structured AST.

manager.rs - ParserManager

The ParserManager is the central hub for all parsing activities. It holds a collection of tree-sitter parsers, one for each supported language.

  • Initialization: The new() constructor calls initialize_parsers(), which sets up parsers for TypeScript, JavaScript, Rust, Python, SQL, Go, Java, C, C++, C#, and Svelte.

    // From rust-core/src/parsing/manager.rs
    fn initialize_parsers(&mut self) -> Result<(), ParseError> {
        // TypeScript parser
        let mut ts_parser = Parser::new();
        ts_parser.set_language(&tree_sitter_typescript::LANGUAGE_TYPESCRIPT.into()).map_err(...)?;
        self.parsers.insert("typescript".to_string(), ts_parser);
    
        // ... and so on for other languages
        Ok(())
    }
  • Parsing: The parse method takes source code and a language identifier, and returns a tree-sitter Tree.

    // From rust-core/src/parsing/manager.rs
    pub fn parse(&mut self, code: &str, language: &str) -> Result<Tree, ParseError> {
        let parser = self.parsers.get_mut(language).ok_or_else(...)?;
        parser.parse(code, None).ok_or_else(...)
    }

tree_walker.rs - TreeWalker

This is a generic utility that performs a depth-first traversal of a tree-sitter AST. It allows other modules, like the extractors, to apply logic to every node in the tree without duplicating traversal code.

// From rust-core/src/parsing/tree_walker.rs
pub fn walk<F>(&self, node: Node<'_>, visitor: &mut F) -> Result<(), String>
where
    F: FnMut(Node<'_>) -> Result<(), String>,
{
    self.walk_recursive(node, visitor, 0)
}
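
The public walk method delegates to a private walk_recursive helper, which is not shown in full here. A minimal sketch of what such a depth-first traversal looks like over tree-sitter nodes follows (the real implementation in tree_walker.rs may add depth limits or richer error context):

// Sketch: depth-first traversal, visiting the current node before its children.
fn walk_recursive<F>(&self, node: Node<'_>, visitor: &mut F, depth: usize) -> Result<(), String>
where
    F: FnMut(Node<'_>) -> Result<(), String>,
{
    visitor(node)?;
    let mut cursor = node.walk();
    for child in node.children(&mut cursor) {
        self.walk_recursive(child, visitor, depth + 1)?;
    }
    Ok(())
}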

fallback.rs - FallbackExtractor

As a safety net, if tree-sitter fails to parse a file (due to syntax errors or unsupported language features), the FallbackExtractor uses a set of regular expressions to perform a basic, line-by-line analysis to extract at least some concepts.

// From rust-core/src/parsing/fallback.rs
fn extract_function_name(&self, line: &str) -> Option<String> {
    // TypeScript/JavaScript function patterns
    if line.contains("function ") { ... }

    // Arrow function patterns: const funcName = () =>
    if line.contains("=>") { ... }

    // Rust function patterns
    if line.contains("fn ") { ... }

    // Python function patterns
    if line.trim_start().starts_with("def ") { ... }

    None
}
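
Each elided branch pulls an identifier out of the line. As an illustration, a branch like the Rust one could be implemented with a small capture group using the regex crate (this snippet is a sketch, not the actual code in fallback.rs):

use regex::Regex;

// Sketch: extract the identifier from a line such as `pub fn parse_file(...)`.
fn extract_rust_fn_name(line: &str) -> Option<String> {
    // A real implementation would compile this pattern once and reuse it.
    let re = Regex::new(r"fn\s+([A-Za-z_][A-Za-z0-9_]*)").ok()?;
    re.captures(line)
        .and_then(|caps| caps.get(1))
        .map(|m| m.as_str().to_string())
}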

Extractors (rust-core/src/extractors/)

Extractors are responsible for traversing the AST produced by the parser and converting language-specific nodes into the generic SemanticConcept struct. Each supported language has its own extractor module.

Example: typescript.rs

The TypeScriptExtractor looks for nodes specific to TypeScript/JavaScript syntax.

// From rust-core/src/extractors/typescript.rs
pub fn extract_concepts(...) -> Result<(), ParseError> {
    match node.kind() {
        "class_declaration" | "interface_declaration" | "type_alias_declaration" => {
            // ... extract class/interface/type concept
        }
        "function_declaration" | "method_definition" | "arrow_function" => {
            // ... extract function concept
        }
        "variable_declaration" | "lexical_declaration" => {
            // ... extract variable concept
        }
        // ... and so on
    }
    Ok(())
}
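
Inside each arm, the extractor reads the node's name, records its position, and builds a SemanticConcept. A simplified sketch of what the class arm might do (the helper name, id scheme, and confidence value are illustrative, and the LineRange fields follow the assumed definition shown earlier):

// Sketch: turn a `class_declaration` node into a SemanticConcept.
fn extract_class_concept(node: Node<'_>, source: &str, file_path: &str) -> Option<SemanticConcept> {
    // tree-sitter exposes the class identifier via the "name" field.
    let name_node = node.child_by_field_name("name")?;
    let name = name_node.utf8_text(source.as_bytes()).ok()?.to_string();
    Some(SemanticConcept {
        id: format!("class_{}_{}", file_path, node.start_position().row),
        name,
        concept_type: "class".to_string(),
        confidence: 0.9,
        file_path: file_path.to_string(),
        line_range: LineRange {
            start: node.start_position().row as u32 + 1,
            end: node.end_position().row as u32 + 1,
        },
        relationships: std::collections::HashMap::new(),
        metadata: std::collections::HashMap::new(),
    })
}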

Analysis (rust-core/src/analysis/)

This module performs the higher-level analysis on the collection of SemanticConcepts extracted from the ASTs.

semantic.rs - SemanticAnalyzer

This is the main orchestrator for analysis. Its analyze_codebase method is a key entry point that:

  1. Detects languages and frameworks.
  2. Calls the appropriate extractors to get all semantic concepts.
  3. Invokes the ComplexityAnalyzer to calculate metrics.
  4. Returns a CodebaseAnalysisResult.
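
In outline, that flow reads roughly as follows (the collaborator fields and method names are illustrative; see semantic.rs for the real signatures):

// Sketch of the analyze_codebase flow described above.
pub async fn analyze_codebase(&mut self, path: String) -> Result<CodebaseAnalysisResult, ParseError> {
    let languages = self.detect_languages(&path)?;                       // step 1
    let frameworks = self.framework_detector.detect_frameworks(&path)?;  // step 1
    let concepts = self.extract_all_concepts(&path, &languages)?;        // step 2
    let complexity = self.complexity_analyzer.calculate(&concepts)?;     // step 3
    Ok(CodebaseAnalysisResult { languages, frameworks, complexity, concepts }) // step 4
}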

blueprint.rs - BlueprintAnalyzer

This analyzer is responsible for creating a quick, high-level overview of a project.

  • detect_entry_points: Scans for common entry point files like index.ts, main.py, App.jsx, etc., to understand how the project starts.
  • map_key_directories: Looks for conventional directory names like src/components, src/services, src/api, src/auth to understand the project's structure.
  • build_feature_map: Maps directories to high-level features (e.g., the auth directory is mapped to the "authentication" feature).
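
A minimal sketch of the entry-point scan (the candidate list, function signature, and return type here are illustrative; the full list lives in blueprint.rs):

use std::path::{Path, PathBuf};

// Sketch: look for well-known entry-point files in the project root and src/.
fn detect_entry_points(root: &Path) -> Vec<PathBuf> {
    const CANDIDATES: &[&str] = &["index.ts", "index.js", "main.rs", "main.py", "App.jsx", "App.tsx"];
    CANDIDATES
        .iter()
        .flat_map(|name| [root.join(name), root.join("src").join(name)])
        .filter(|path| path.exists())
        .collect()
}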

frameworks.rs - FrameworkDetector

This component detects the frameworks and libraries used in a project by scanning dependency files.

// From rust-core/src/analysis/frameworks.rs
fn check_package_files(...) -> Result<(), ParseError> {
    let package_files = [
        "package.json",
        "Cargo.toml",
        "requirements.txt",
        "pom.xml",
        "go.mod",
        // ...
    ];
    // ... walk directories and analyze these files
}
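
Once a dependency file has been read, detection largely reduces to matching dependency names against a table of known frameworks. A hedged sketch of that mapping step (the table and function name are illustrative):

// Sketch: map dependency names (e.g. parsed out of package.json or Cargo.toml)
// to framework labels.
fn frameworks_from_dependencies(deps: &[String]) -> Vec<String> {
    let known = [
        ("react", "React"),
        ("svelte", "Svelte"),
        ("express", "Express"),
        ("django", "Django"),
        ("axum", "Axum"),
    ];
    known
        .iter()
        .filter(|(needle, _)| deps.iter().any(|dep| dep.contains(*needle)))
        .map(|(_, label)| label.to_string())
        .collect()
}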

Patterns (rust-core/src/patterns/)

This is the most advanced module in the Rust core, responsible for machine-learning-style pattern discovery.

learning.rs - PatternLearningEngine

This is the central engine that drives all pattern learning. Its learn_from_codebase method is a multi-phase process:

  1. Collects all SemanticConcepts.
  2. Invokes the NamingPatternAnalyzer.
  3. Invokes the StructuralPatternAnalyzer.
  4. Invokes the ImplementationPatternAnalyzer.
  5. Consolidates and validates the discovered patterns.
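
Sketched in code, those phases read roughly as follows (field and method names are illustrative, as are the consolidation thresholds):

// Sketch of the learn_from_codebase phases listed above.
pub fn learn_from_codebase(&mut self, concepts: &[SemanticConcept]) -> Result<Vec<Pattern>, ParseError> {
    let mut patterns = Vec::new();
    patterns.extend(self.naming_analyzer.analyze(concepts)?);          // phase 2
    patterns.extend(self.structural_analyzer.analyze(concepts)?);      // phase 3
    patterns.extend(self.implementation_analyzer.analyze(concepts)?);  // phase 4
    // Phase 5: keep only patterns that recur and carry reasonable confidence.
    patterns.retain(|p| p.frequency > 1 && p.confidence >= 0.5);
    Ok(patterns)
}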

naming.rs - NamingPatternAnalyzer

Learns the dominant naming conventions in a codebase.

  • It defines a set of NamingRules with regex patterns for different conventions (e.g., ^[a-z][a-zA-Z0-9]*$ for camelCase).
  • It analyzes the names of functions, classes, and variables against these rules to determine the project's preferred style (e.g., "functions are camelCase", "classes are PascalCase").
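
A hedged sketch of how such a rule table and the classification step might look (the regexes mirror the camelCase example above; the real rule set in naming.rs is richer):

use regex::Regex;

// Sketch: classify an identifier against known naming conventions.
// The first matching rule wins, so rule order matters for ambiguous names.
fn classify_name(name: &str) -> Option<&'static str> {
    let rules = [
        ("camelCase", r"^[a-z][a-zA-Z0-9]*$"),
        ("PascalCase", r"^[A-Z][a-zA-Z0-9]*$"),
        ("snake_case", r"^[a-z][a-z0-9_]*$"),
    ];
    rules
        .iter()
        .find(|(_, pattern)| Regex::new(pattern).ok().is_some_and(|re| re.is_match(name)))
        .map(|(label, _)| *label)
}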

structural.rs - StructuralPatternAnalyzer

Analyzes the high-level structure and architecture of the codebase.

  • It defines ArchitectureSignatures for common patterns like MVC, Clean Architecture, and Layered Architecture.
  • It checks for these signatures by analyzing directory structures (e.g., presence of models/, views/, controllers/) and file naming patterns.
  • It also detects structural anti-patterns like God Objects (classes with too many methods) and Circular Dependencies.
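
The anti-pattern checks are largely threshold-based. For example, God Object detection can be approximated like this (the "member_of" relationship key and the threshold are assumptions for illustration):

// Sketch: flag classes whose method count exceeds a threshold.
fn find_god_objects(concepts: &[SemanticConcept], max_methods: usize) -> Vec<&SemanticConcept> {
    concepts
        .iter()
        .filter(|c| c.concept_type == "class")
        .filter(|class| {
            let method_count = concepts
                .iter()
                .filter(|m| {
                    m.concept_type == "function"
                        && m.relationships.get("member_of") == Some(&class.id)
                })
                .count();
            method_count > max_methods
        })
        .collect()
}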

implementation.rs - ImplementationPatternAnalyzer

This analyzer detects common software design patterns.

  • It defines PatternSignatures for patterns like Singleton, Factory, Observer, and Strategy.
  • A signature consists of required methods (e.g., getInstance for Singleton), class characteristics, and common code patterns (checked via regex).
  • It analyzes the SemanticConcepts to see which ones match these signatures.
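
A simplified sketch of checking a class concept against such a signature (the struct shape, the "member_of" key, and the matcher are illustrative):

// Sketch: a pattern signature and a naive matcher over extracted concepts.
struct PatternSignature {
    name: &'static str,
    required_methods: &'static [&'static str],
}

fn matches_signature(
    sig: &PatternSignature,
    class: &SemanticConcept,
    concepts: &[SemanticConcept],
) -> bool {
    // Every required method must exist as a function concept belonging to this class.
    sig.required_methods.iter().all(|required| {
        concepts.iter().any(|c| {
            c.concept_type == "function"
                && c.name == *required
                && c.relationships.get("member_of") == Some(&class.id)
        })
    })
}

// e.g. a Singleton signature: PatternSignature { name: "Singleton", required_methods: &["getInstance"] }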

prediction.rs - ApproachPredictor

This component uses all the learned patterns to make intelligent predictions.

  • It maintains a set of ApproachTemplates (e.g., "Microservices Architecture", "Modular Monolith").
  • When given a problem description, it calculates a confidence score for each template based on the problem's complexity and how well the template's required patterns match the patterns already learned from the codebase.
  • This allows it to suggest an approach that is consistent with the project's existing style.
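
In rough terms, each template's score blends how well its required patterns are covered by what was learned with how well the approach fits the estimated problem complexity. A hedged sketch of that scoring (struct shape, weights, and field names are illustrative):

// Sketch: score an approach template against the learned patterns.
struct ApproachTemplate {
    name: &'static str,
    required_patterns: Vec<String>,
    complexity_fit: f64, // 0.0..=1.0, how well the approach suits the problem's complexity
}

fn score_template(template: &ApproachTemplate, learned: &[Pattern]) -> f64 {
    if template.required_patterns.is_empty() {
        return template.complexity_fit;
    }
    let matched = template
        .required_patterns
        .iter()
        .filter(|required| learned.iter().any(|p| &p.pattern_type == *required))
        .count() as f64;
    let coverage = matched / template.required_patterns.len() as f64;
    // The 60/40 weighting is an arbitrary illustration.
    0.6 * coverage + 0.4 * template.complexity_fit
}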

4. NAPI Bindings

The entire Rust core is exposed to the TypeScript application via napi-rs. The #[napi] attribute, applied conditionally through cfg_attr behind the napi-bindings feature, is used on structs and impl blocks to automatically generate the necessary JavaScript bindings.

// From rust-core/src/analysis/semantic.rs
#[cfg_attr(feature = "napi-bindings", napi)]
impl SemanticAnalyzer {
    #[cfg_attr(feature = "napi-bindings", napi(constructor))]
    pub fn new() -> Result<Self, ParseError> { ... }

    #[cfg_attr(feature = "napi-bindings", napi)]
    pub async unsafe fn analyze_codebase(...) -> Result<CodebaseAnalysisResult, ParseError> { ... }
}

This allows the TypeScript code in src/engines/semantic-engine.ts to import and use the SemanticAnalyzer as if it were a native TypeScript class, while all the heavy computation happens in Rust.
