@wassfila
Member

No description provided.

@wassfila wassfila requested a review from Copilot November 16, 2025 17:01
Copilot finished reviewing on behalf of wassfila November 16, 2025 17:02
Copilot AI left a comment

Pull Request Overview

This PR migrates the content-structure library from JSON-based storage to SQLite database storage, introducing better-sqlite3 for structured data persistence alongside the existing JSON outputs.

Key changes:

  • Replaces glob batch collection with streaming via globStream for memory efficiency
  • Adds SQLite database generation with normalized tables for documents, assets, blobs, references, images, code blocks, tables, and paragraphs
  • Implements blob storage with SHA-512 hashing and deduplication
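
The blob deduplication listed above could look roughly like the following. This is a minimal sketch using only Node's standard library, not the code in src/blob_manager.js: the helper name `storeBlob`, the flat directory layout, and the use of the hex digest as the file name are all assumptions made for illustration.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper (not from the PR): store a buffer under its SHA-512
// hex digest and report whether a new file was actually written.
function storeBlob(blobDir, content) {
  const hash = crypto.createHash('sha512').update(content).digest('hex');
  const target = path.join(blobDir, hash);
  if (fs.existsSync(target)) {
    // Same content was stored before, so nothing is written again.
    return { hash, written: false };
  }
  fs.mkdirSync(blobDir, { recursive: true });
  fs.writeFileSync(target, content);
  return { hash, written: true };
}

// Usage: identical bytes map to one file, different bytes to another.
const blobDir = fs.mkdtempSync(path.join(os.tmpdir(), 'blobs-'));
const first = storeBlob(blobDir, Buffer.from('same bytes'));
const second = storeBlob(blobDir, Buffer.from('same bytes'));
const third = storeBlob(blobDir, Buffer.from('other bytes'));
console.log(first.written, second.written, third.written); // true false true
console.log(fs.readdirSync(blobDir).length); // 2
```

Keying the file by its content hash means the existence check doubles as the deduplication test, so a database row would only need to record the hash to reference the blob.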

Reviewed Changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/structure_db.js | Core SQLite database writer implementing schema loading and table creation from catalog.yaml |
| src/sqlite_utils/index.js | Database utility layer providing connection caching, table management, and bulk insert operations |
| src/blob_manager.js | Blob storage manager handling SHA-512 hashing and deduplicated file persistence |
| src/collect.js | Refactored to use streaming file collection with globStream instead of batch processing |
| index.js | Main collection logic rewritten to build SQLite database with streaming document processing |
| src/md_utils.js | Updated asset info builders to include blob content and parent document references |
| catalog.yaml | Schema definition for SQLite database structure with table and column specifications |
| src/utils.js | Added load_yaml_code function for loading YAML from code directory |
| package.json | Added better-sqlite3 dependency and updated demo script |
| pnpm-lock.yaml | Updated lockfile with better-sqlite3 and dependency version bumps |
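
The move from batch to streaming collection described for src/collect.js can be illustrated with a lazy generator. Note that this is a stand-in, not the PR's approach: it swaps globStream for a plain directory walk built on Node's standard library, and the name `walkFiles` and the extension filter are hypothetical. What it shows is only the general idea that each file is handed to the caller as soon as it is found, rather than after the full list has been built.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical stand-in for a glob stream: lazily yield file paths under
// rootDir whose name ends with ext, one directory entry at a time.
function* walkFiles(rootDir, ext) {
  const dir = fs.opendirSync(rootDir);
  try {
    let entry;
    while ((entry = dir.readSync()) !== null) {
      const full = path.join(rootDir, entry.name);
      if (entry.isDirectory()) {
        yield* walkFiles(full, ext);
      } else if (entry.isFile() && entry.name.endsWith(ext)) {
        yield full;
      }
    }
  } finally {
    // Runs even if the consumer stops iterating early.
    dir.closeSync();
  }
}

// Usage: build a small tree, then process documents as they are found.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'content-'));
fs.mkdirSync(path.join(root, 'sub'));
fs.writeFileSync(path.join(root, 'a.md'), '# A');
fs.writeFileSync(path.join(root, 'sub', 'b.md'), '# B');
fs.writeFileSync(path.join(root, 'notes.txt'), 'ignored');

let count = 0;
for (const file of walkFiles(root, '.md')) {
  count += 1; // a real collector would parse and store the document here
}
console.log(count); // 2
```
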
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

streamline asset extraction and processing in document handling
… UID generation

- Bump version from 1.1.10 to 2.0.0 in package.json
- Update dependencies: glob (10.3.10 to 13.0.0), js-yaml (4.1.0 to 4.1.1), remark-directive (3.0.0 to 4.0.0), remark-gfm (4.0.0 to 4.0.1), unified (11.0.4 to 11.0.5)
- Remove unused item UID generation in structure_db.js
- Clean up pnpm-lock.yaml to reflect updated dependencies and remove deprecated packages
@wassfila wassfila merged commit 4e3fba6 into main Nov 20, 2025