opsidian_core

Shared core library for Obsidian knowledge graph generators. Provides generic caching, topic categorization, template rendering, vault writing, and sync state management that work with any data source.

Overview

opsidian_core is the foundation layer shared by:

opsidian_graph — work knowledge graph from GitHub PRs, JIRA issues, and Confluence pages
drive_og — personal knowledge graph from Google Drive documents
opsidian_meta — unified productivity analysis vault that reads both caches and generates timeline views, focus reports, and knowledge-gap detection

All projects produce Obsidian vaults with interconnected markdown notes, wikilinks, and topic-based organization. This library provides the common machinery they share.

Architecture

                    opsidian_core (shared library)
                ┌────────────────────────────────────┐
                │  models.py      BaseDocument       │
                │  cache.py       JSON persistence   │
                │  categorizer.py keyword + LLM      │
                │  synthesizer.py weekly summaries   │
                │  note_generator.py Jinja2 engine   │
                │  vault_writer.py markdown output   │
                │  state_manager.py incremental sync │
                │  config.py      topic YAML loading │
                │  templates/     shared .j2 files   │
                └──────────┬───────────┬─────────────┘
                           │           │
              ┌────────────┘           └────────────┐
              v                                     v
    opsidian_graph                             drive_og
    (GitHub/JIRA/Confluence)              (Google Drive)

Installation

pip install -e .

Key Concepts

BaseDocument Protocol

All data sources must produce objects compatible with BaseDocument. This uses duck-typing — no inheritance required. Any dataclass with these fields works:

from dataclasses import dataclass

@dataclass
class BaseDocument:
    id: str              # Unique identifier
    title: str           # Document title
    body_text: str       # Extracted text content
    source_type: str     # "github", "jira", "confluence", "gdrive", "desktop"
    source_group: str    # Grouping key (repo name, folder path, project key)
    created_at: str      # ISO 8601 timestamp
    updated_at: str      # ISO 8601 timestamp
    url: str             # Web URL to original document
    labels: list[str]    # Tags/labels from the source

    @property
    def date(self) -> str:
        # YYYY-MM-DD from updated_at; slicing the ISO 8601 string is an assumed implementation
        return self.updated_at[:10]

To check compatibility at runtime:

from opsidian_core import is_base_document

assert is_base_document(my_custom_doc)  # True if all fields present
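
For intuition, a duck-typing check of this kind can be sketched with `hasattr`. This is not the library's implementation of `is_base_document`; the names `REQUIRED_FIELDS`, `looks_like_base_document`, and `Note` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field

# Field names taken from the BaseDocument listing above.
REQUIRED_FIELDS = (
    "id", "title", "body_text", "source_type", "source_group",
    "created_at", "updated_at", "url", "labels",
)


def looks_like_base_document(obj: object) -> bool:
    """Hypothetical helper: True if obj exposes every BaseDocument field."""
    return all(hasattr(obj, name) for name in REQUIRED_FIELDS)


@dataclass
class Note:  # hypothetical source-specific class; no inheritance involved
    id: str
    title: str
    body_text: str
    source_type: str
    source_group: str
    created_at: str
    updated_at: str
    url: str
    labels: list[str] = field(default_factory=list)


note = Note("n1", "Design doc", "text", "gdrive", "Projects",
            "2024-01-10T09:00:00Z", "2024-01-15T12:30:00Z",
            "https://example.com/n1")

print(looks_like_base_document(note))      # True
print(looks_like_base_document(object()))  # False
```

Because the check looks only at attribute names, a class that exposes the same names through properties would pass as well.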

Topic Categorization

Documents are categorized into topics using a priority stack:

1. Keyword matching: regex patterns from config/topics.yaml
2. LLM fallback: Claude Haiku classifies unmatched documents (requires ANTHROPIC_API_KEY)
3. Default: "Uncategorized" if both fail

Documents can have multiple topics (multi-label).
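
The priority order above can be sketched roughly as follows. This is a simplified illustration, not the code behind `categorize_document`: `Rule`, `keyword_topics`, and `categorize` are hypothetical names, and the LLM step is replaced by an optional callable so the example runs without an API key.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:  # hypothetical stand-in for TopicRule
    name: str
    patterns: list[re.Pattern]


def keyword_topics(text: str, rules: list[Rule]) -> list[str]:
    """Step 1: every rule with at least one matching pattern contributes its topic."""
    return [r.name for r in rules if any(p.search(text) for p in r.patterns)]


def categorize(text: str, rules: list[Rule],
               llm: Optional[Callable[[str], list[str]]] = None) -> list[str]:
    topics = keyword_topics(text, rules)
    if not topics and llm is not None:
        topics = llm(text)              # Step 2: fallback classifier (stubbed here)
    return topics or ["Uncategorized"]  # Step 3: default when nothing matched


rules = [
    Rule("Machine Learning", [re.compile(r"\bml\b", re.IGNORECASE)]),
    Rule("CI/CD", [re.compile(r"\bci\b", re.IGNORECASE),
                   re.compile(r"\bworkflow\b", re.IGNORECASE)]),
]

print(categorize("Add ML step to the CI workflow", rules))
# ['Machine Learning', 'CI/CD']
print(categorize("Quarterly budget review", rules))
# ['Uncategorized']
print(categorize("Quarterly budget review", rules, llm=lambda text: ["Finance"]))
# ['Finance']
```

The first call shows the multi-label case: one document matches two rules and receives both topics.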

Caching

Generic JSON cache that serializes any dataclass via dataclasses.asdict():

from opsidian_core import save_all_items, load_items

save_all_items(cache_dir, "_gdrive", docs)   # Write to cache/_gdrive/<group>/<id>.json
items = load_items(cache_dir, "_gdrive")      # Returns list[dict]
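
To make the on-disk layout concrete, here is a minimal sketch of a cache with the same `<prefix>/<group>/<id>.json` shape. It is an approximation under stated assumptions, not the library's code: `Doc`, `save_docs`, and `load_docs` are hypothetical names, and the group is assumed to be a single path segment.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Doc:  # hypothetical minimal document
    id: str
    title: str
    source_group: str


def save_docs(cache_dir: Path, prefix: str, docs: list[Doc]) -> None:
    """Write each document to <cache_dir>/<prefix>/<source_group>/<id>.json."""
    for doc in docs:
        target = cache_dir / prefix / doc.source_group / f"{doc.id}.json"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(asdict(doc), indent=2), encoding="utf-8")


def load_docs(cache_dir: Path, prefix: str) -> list[dict]:
    """Read every cached JSON file under the prefix back as a plain dict."""
    files = sorted((cache_dir / prefix).glob("*/*.json"))
    return [json.loads(p.read_text(encoding="utf-8")) for p in files]


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    save_docs(cache, "_gdrive", [Doc("a1", "Roadmap", "Planning"),
                                 Doc("b2", "Notes", "Inbox")])
    loaded = load_docs(cache, "_gdrive")

print(sorted(d["id"] for d in loaded))  # ['a1', 'b2']
```

Loading returns dicts rather than dataclass instances, which matches the `list[dict]` return noted for `load_items`.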

API Reference

Models

BaseDocument — common document dataclass
is_base_document(obj) — duck-typing check

Cache

save_item(cache_dir, source_prefix, group, item) — save one item
save_all_items(cache_dir, source_prefix, items) — save list, grouped by source_group
load_items(cache_dir, source_prefix) — load all items as dicts
delete_item(cache_dir, source_prefix, group, item_id) — remove one item

Categorizer

TopicRule(name, patterns) — topic definition with compiled regex
extract_topics_from_text(text, rules) — keyword matching only
categorize_document(doc, rules, *, use_llm=True) — full pipeline (keywords + LLM + default)

Config

load_topics(config_dir) — parse config/topics.yaml into list[TopicRule]

Note Generator

sanitize_filename(name) — clean string for use as filename
week_id(date_str) — "2024-01-15" → "2024-W03"
month_id(date_str) — "2024-01-15" → "2024-01"
create_template_env(*template_dirs) — Jinja2 environment with ChoiceLoader
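
The `week_id` example above is consistent with ISO 8601 week numbering. Assuming that is the convention in use (the README does not say so explicitly), the two date helpers could be approximated like this; `iso_week_id` and `year_month_id` are hypothetical names, not the library functions.

```python
from datetime import date


def iso_week_id(date_str: str) -> str:
    """Map YYYY-MM-DD to an ISO 8601 year-week label such as 2024-W03."""
    year, week, _ = date.fromisoformat(date_str).isocalendar()
    return f"{year}-W{week:02d}"


def year_month_id(date_str: str) -> str:
    """Map YYYY-MM-DD to YYYY-MM by keeping the first seven characters."""
    return date_str[:7]


print(iso_week_id("2024-01-15"))    # 2024-W03
print(year_month_id("2024-01-15"))  # 2024-01
print(iso_week_id("2024-12-30"))    # 2025-W01 (ISO year differs from calendar year)
```

The last call shows why the year must come from `isocalendar()` and not from the date string: days at the end of December can belong to week 1 of the following ISO year.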

State Manager

SyncState(last_sync, sources) — incremental sync state
load_state(path) / save_state(path, state) — JSON persistence
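
As a rough picture of JSON-backed sync state, the sketch below round-trips a small state object through a file. The shape of `sources` (a mapping from source name to a timestamp cursor) is an assumption, and `State`, `load`, and `save` are hypothetical stand-ins for `SyncState`, `load_state`, and `save_state`.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class State:  # hypothetical stand-in for SyncState
    last_sync: str = ""
    sources: dict[str, str] = field(default_factory=dict)  # assumed shape


def load(path: Path) -> State:
    """Return the saved state, or an empty one on the first run."""
    if not path.exists():
        return State()
    return State(**json.loads(path.read_text(encoding="utf-8")))


def save(path: Path, state: State) -> None:
    path.write_text(json.dumps(asdict(state), indent=2), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "state.json"
    first = load(state_path)  # no file yet, so this is an empty state
    save(state_path, State("2024-01-15T12:00:00Z",
                           {"gdrive": "2024-01-15T12:00:00Z"}))
    second = load(state_path)

print(first.last_sync == "")  # True
print(second.sources)         # {'gdrive': '2024-01-15T12:00:00Z'}
```

An incremental sync would then fetch only items updated after the stored cursor and save a new state once the run succeeds.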

Synthesizer

generate_weekly_summary(week_id, items) — LLM-powered summary (returns None if no API key)

Vault Writer

write_note(vault_path, rel_path, content) — write single markdown note
write_all_notes(vault_path, notes) — batch write, returns count
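
A minimal sketch of what writing notes into a vault involves is shown below. The structure of `notes` is not documented here, so a mapping from relative path to markdown content is assumed; `write_markdown` and `write_many` are hypothetical names, not the library functions.

```python
import tempfile
from pathlib import Path


def write_markdown(vault: Path, rel_path: str, content: str) -> Path:
    """Write one note, creating intermediate folders as needed."""
    target = vault / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target


def write_many(vault: Path, notes: dict[str, str]) -> int:
    """Write every note and return how many were written."""
    for rel_path, content in notes.items():
        write_markdown(vault, rel_path, content)
    return len(notes)


with tempfile.TemporaryDirectory() as tmp:
    vault = Path(tmp)
    count = write_many(vault, {
        "Topics/Machine Learning.md": "# Machine Learning\n\n- [[2024-W03]]\n",
        "Weekly/2024-W03.md": "# 2024-W03\n\n- [[Machine Learning]]\n",
    })
    written = sorted(p.relative_to(vault).as_posix() for p in vault.rglob("*.md"))

print(count)    # 2
print(written)  # ['Topics/Machine Learning.md', 'Weekly/2024-W03.md']
```

The two sample notes link to each other with wikilinks, which is what makes the resulting vault navigable as a graph in Obsidian.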

Shared Templates

Located in templates/:

| Template | Purpose |
| --- | --- |
| `topic_moc.md.j2` | Cross-source topic Map of Content |
| `weekly_note.md.j2` | Weekly activity summary |
| `monthly_note.md.j2` | Monthly rollup |
| `expertise.md.j2` | Auto-generated expertise profile |
| `index.md.j2` | Dashboard / index page |

Configuration

`config/topics.yaml`

topics:
  Machine Learning:
    - "\\bml\\b"
    - "\\bdeep.learning"
    - "\\bneural.network"
  CI/CD:
    - "\\bci\\b"
    - "\\bworkflow\\b"

Patterns are case-insensitive regex. Each topic can have multiple patterns.
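
The short check below illustrates how patterns like these behave once compiled. It assumes the patterns are compiled with `re.IGNORECASE` and applied with `search`, and it writes them as Python raw strings, which is what the YAML double-quoted `"\\b"` unescapes to. The `matches` helper is hypothetical.

```python
import re

# Regex strings as they look after YAML turns "\\b" into \b.
patterns = {
    "Machine Learning": [r"\bml\b", r"\bdeep.learning", r"\bneural.network"],
    "CI/CD": [r"\bci\b", r"\bworkflow\b"],
}
compiled = {
    topic: [re.compile(p, re.IGNORECASE) for p in topic_patterns]
    for topic, topic_patterns in patterns.items()
}


def matches(topic: str, text: str) -> bool:
    """Hypothetical helper: True if any pattern for the topic is found in text."""
    return any(p.search(text) for p in compiled[topic])


print(matches("Machine Learning", "ML pipeline"))          # True: case-insensitive
print(matches("Machine Learning", "HTML cleanup"))         # False: \b rejects "ml" inside "HTML"
print(matches("Machine Learning", "Deep-learning notes"))  # True: "." matches the hyphen
print(matches("CI/CD", "Workflows for release"))           # False: trailing \b rejects the plural
```

Dropping the trailing `\b` from a pattern (as in `\bdeep.learning`) lets it match longer word forms, while keeping it restricts the match to the exact word.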

Adding a New Data Source

To create a new project that uses opsidian_core:

1. Define a dataclass with BaseDocument-compatible fields (use properties to map source-specific field names onto the BaseDocument names)
2. Write a client that fetches data from your source
3. Use save_all_items / load_items for caching
4. Use categorize_document for topic extraction
5. Create source-specific Jinja2 templates
6. Use create_template_env to merge core + source templates
7. Use write_all_notes to output the vault

See drive_og for a complete example.
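
As an illustration of the first step, the sketch below maps a source's native field names onto the BaseDocument names with properties. Everything in it is hypothetical: `FeedEntry`, its fields, and the `"rss"` source type are invented for the example and do not come from drive_og or any existing project.

```python
from dataclasses import dataclass, field


@dataclass
class FeedEntry:  # hypothetical source record with its own native field names
    guid: str
    headline: str
    summary: str
    feed_name: str
    published: str  # ISO 8601 timestamp
    modified: str   # ISO 8601 timestamp
    link: str
    tags: list[str] = field(default_factory=list)

    # Properties expose the BaseDocument names without renaming the fields.
    @property
    def id(self) -> str:
        return self.guid

    @property
    def title(self) -> str:
        return self.headline

    @property
    def body_text(self) -> str:
        return self.summary

    @property
    def source_type(self) -> str:
        return "rss"  # hypothetical value, not one of the listed source types

    @property
    def source_group(self) -> str:
        return self.feed_name

    @property
    def created_at(self) -> str:
        return self.published

    @property
    def updated_at(self) -> str:
        return self.modified

    @property
    def url(self) -> str:
        return self.link

    @property
    def labels(self) -> list[str]:
        return self.tags

    @property
    def date(self) -> str:
        return self.updated_at[:10]  # YYYY-MM-DD prefix of the ISO timestamp


entry = FeedEntry("g-1", "Release notes", "Body text", "Engineering Blog",
                  "2024-01-10T09:00:00Z", "2024-01-15T12:30:00Z",
                  "https://example.com/g-1", ["release"])
print(entry.title, entry.source_group, entry.date)
# Release notes Engineering Blog 2024-01-15
```

One caveat with this approach: `dataclasses.asdict()` serializes only declared fields, so property-based names such as `title` would not appear in JSON produced that way. A source that relies on properties may need to convert to plain BaseDocument-style fields before caching.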

Tests

python -m pytest tests/ -v

19 tests covering all modules.

License

MIT
