Shared core library for Obsidian knowledge graph generators. Provides generic caching, topic categorization, template rendering, vault writing, and sync state management that works with any data source.
opsidian_core is the foundation layer shared by:
- opsidian_graph — work knowledge graph from GitHub PRs, JIRA issues, and Confluence pages
- drive_og — personal knowledge graph from Google Drive documents
- opsidian_meta — unified productivity analysis vault that reads both caches and generates timeline views, focus reports, and knowledge-gap detection
All projects produce Obsidian vaults with interconnected markdown notes, wikilinks, and topic-based organization. This library provides the common machinery they share.
    opsidian_core (shared library)
    ┌───────────────────────────────────────┐
    │ models.py          BaseDocument       │
    │ cache.py           JSON persistence   │
    │ categorizer.py     keyword + LLM      │
    │ synthesizer.py     weekly summaries   │
    │ note_generator.py  Jinja2 engine      │
    │ vault_writer.py    markdown output    │
    │ state_manager.py   incremental sync   │
    │ config.py          topic YAML loading │
    │ templates/         shared .j2 files   │
    └────────────┬─────────────┬────────────┘
                 │             │
          ┌──────┘             └──────┐
          v                           v
    opsidian_graph                drive_og
    (GitHub/JIRA/Confluence)   (Google Drive)

    pip install -e .

All data sources must produce objects compatible with `BaseDocument`. This relies on duck typing, so no inheritance is required. Any dataclass with these fields works:

    from dataclasses import dataclass

    @dataclass
    class BaseDocument:
        id: str              # Unique identifier
        title: str           # Document title
        body_text: str       # Extracted text content
        source_type: str     # "github", "jira", "confluence", "gdrive", "desktop"
        source_group: str    # Grouping key (repo name, folder path, project key)
        created_at: str      # ISO 8601 timestamp
        updated_at: str      # ISO 8601 timestamp
        url: str             # Web URL to original document
        labels: list[str]    # Tags/labels from the source

        @property
        def date(self) -> str:
            # YYYY-MM-DD from updated_at
            return self.updated_at[:10]

To check compatibility at runtime:

    from opsidian_core import is_base_document

    assert is_base_document(my_custom_doc)  # True if all fields are present

Documents are categorized into topics using a priority stack:
- Keyword matching: regex patterns from `config/topics.yaml`
- LLM fallback: Claude Haiku classifies unmatched documents (requires `ANTHROPIC_API_KEY`)
- Default: `"Uncategorized"` if both fail

Documents can have multiple topics (multi-label).
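
The priority stack above can be sketched in a few lines. This is a minimal illustration, not the library's code: `match_keywords`, `categorize`, and the `llm_classify` callback are hypothetical names, and the real `categorize_document` calls Claude Haiku where this sketch accepts any callable.

```python
import re
from typing import Callable, Optional

# Hypothetical stand-in for the library's topic rules: topic name -> compiled patterns.
Rules = dict[str, list[re.Pattern[str]]]


def match_keywords(text: str, rules: Rules) -> list[str]:
    """Return every topic with at least one pattern matching the text."""
    return [name for name, patterns in rules.items()
            if any(p.search(text) for p in patterns)]


def categorize(text: str, rules: Rules,
               llm_classify: Optional[Callable[[str], list[str]]] = None) -> list[str]:
    """Try keywords first, then the optional LLM callback, then the default."""
    topics = match_keywords(text, rules)
    if topics:
        return topics
    if llm_classify is not None:
        try:
            topics = llm_classify(text)
        except Exception:
            topics = []  # assumption: an LLM failure counts as "no answer"
        if topics:
            return topics
    return ["Uncategorized"]


rules: Rules = {
    "Machine Learning": [re.compile(r"\bml\b", re.IGNORECASE)],
    "CI/CD": [re.compile(r"\bci\b", re.IGNORECASE)],
}
```

Because keyword matching collects every topic whose patterns hit, a single document can come back with several topics, which is how the multi-label behaviour falls out of the first stage.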
Generic JSON cache that serializes any dataclass via dataclasses.asdict():

    from opsidian_core import save_all_items, load_items

    save_all_items(cache_dir, "_gdrive", docs)  # Write to cache/_gdrive/<group>/<id>.json
    items = load_items(cache_dir, "_gdrive")    # Returns list[dict]

- `BaseDocument`: common document dataclass
- `is_base_document(obj)`: duck-typing check

- `save_item(cache_dir, source_prefix, group, item)`: save one item
- `save_all_items(cache_dir, source_prefix, items)`: save a list, grouped by `source_group`
- `load_items(cache_dir, source_prefix)`: load all items as dicts
- `delete_item(cache_dir, source_prefix, group, item_id)`: remove one item

- `TopicRule(name, patterns)`: topic definition with compiled regex
- `extract_topics_from_text(text, rules)`: keyword matching only
- `categorize_document(doc, rules, *, use_llm=True)`: full pipeline (keywords + LLM + default)

- `load_topics(config_dir)`: parse `config/topics.yaml` into `list[TopicRule]`

- `sanitize_filename(name)`: clean a string for use as a filename
- `week_id(date_str)`: `"2024-01-15"` → `"2024-W03"`
- `month_id(date_str)`: `"2024-01-15"` → `"2024-01"`
- `create_template_env(*template_dirs)`: Jinja2 environment with `ChoiceLoader`

- `SyncState(last_sync, sources)`: incremental sync state
- `load_state(path)` / `save_state(path, state)`: JSON persistence

- `generate_weekly_summary(week_id, items)`: LLM-powered summary (returns `None` if no API key)

- `write_note(vault_path, rel_path, content)`: write a single markdown note
- `write_all_notes(vault_path, notes)`: batch write, returns count

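
To make the cache functions listed above more concrete, here is a rough sketch of the `<prefix>/<group>/<id>.json` layout built on `dataclasses.asdict()`. The names `Doc`, `save_all`, and `load_all` are hypothetical, and details such as sanitizing group names or ids are assumptions left out of the sketch.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Doc:
    """Hypothetical minimal document with only the fields this sketch needs."""
    id: str
    title: str
    source_group: str


def save_all(cache_dir: Path, prefix: str, items: list) -> int:
    """Write each item to <cache_dir>/<prefix>/<source_group>/<id>.json."""
    for item in items:
        target = cache_dir / prefix / item.source_group / f"{item.id}.json"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(asdict(item), indent=2), encoding="utf-8")
    return len(items)


def load_all(cache_dir: Path, prefix: str) -> list[dict]:
    """Read every cached item under the prefix back as a plain dict."""
    root = cache_dir / prefix
    return [json.loads(p.read_text(encoding="utf-8"))
            for p in sorted(root.glob("*/*.json"))]


# Round trip through a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    save_all(cache, "_gdrive", [Doc("a1", "Notes", "inbox"), Doc("b2", "Plan", "work")])
    loaded = load_all(cache, "_gdrive")
```

Items come back as dicts rather than dataclass instances, which matches the `list[dict]` return noted for `load_items`; rebuilding typed objects would be up to the caller.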
Located in templates/:
| Template | Purpose |
|---|---|
| `topic_moc.md.j2` | Cross-source topic Map of Content |
| `weekly_note.md.j2` | Weekly activity summary |
| `monthly_note.md.j2` | Monthly rollup |
| `expertise.md.j2` | Auto-generated expertise profile |
| `index.md.j2` | Dashboard / index page |
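
The weekly and monthly templates need a key to group documents by. The `week_id` and `month_id` helpers in the API reference produce such keys; the sketch below shows one way they could work, assuming ISO 8601 week numbering (which is consistent with the documented `"2024-01-15"` → `"2024-W03"` example). It is not taken from the library's source.

```python
from datetime import date


def week_id(date_str: str) -> str:
    """Map an ISO date or timestamp to an ISO 8601 week key such as 2024-W03."""
    iso = date.fromisoformat(date_str[:10]).isocalendar()
    return f"{iso[0]}-W{iso[1]:02d}"  # iso[0] is the ISO year, iso[1] the week


def month_id(date_str: str) -> str:
    """Map an ISO date or timestamp to a YYYY-MM key."""
    return date_str[:7]
```

Under ISO numbering the week's year can differ from the calendar year near New Year, so a document dated 2024-12-30 would land in week `2025-W01`.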

    topics:
      Machine Learning:
        - "\\bml\\b"
        - "\\bdeep.learning"
        - "\\bneural.network"
      CI/CD:
        - "\\bci\\b"
        - "\\bworkflow\\b"

Patterns are case-insensitive regular expressions. Each topic can have multiple patterns.
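
To see what these patterns do and do not match, the sketch below compiles them by hand. The `parsed` dict is an assumption about what a YAML parser would return for the config above, and `topics_for` is a hypothetical helper rather than the library's `extract_topics_from_text`.

```python
import re

# In a double-quoted YAML string, "\\b" parses to the two characters \b,
# which the regex engine reads as a word boundary.
parsed = {
    "topics": {
        "Machine Learning": [r"\bml\b", r"\bdeep.learning", r"\bneural.network"],
        "CI/CD": [r"\bci\b", r"\bworkflow\b"],
    }
}

# Compile once, case-insensitively, as the config documentation describes.
compiled = {
    name: [re.compile(p, re.IGNORECASE) for p in patterns]
    for name, patterns in parsed["topics"].items()
}


def topics_for(text: str) -> list[str]:
    """Return every topic with at least one matching pattern."""
    return [name for name, pats in compiled.items()
            if any(p.search(text) for p in pats)]
```

The word boundaries keep `\bml\b` from firing inside words such as "HTML", while the unescaped `.` in `deep.learning` accepts a space, a hyphen, or any other single character between the two words.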
To create a new project that uses opsidian_core:
- Define a dataclass with `BaseDocument`-compatible fields (use properties for mapping)
- Write a client that fetches data from your source
- Use `save_all_items` / `load_items` for caching
- Use `categorize_document` for topic extraction
- Create source-specific Jinja2 templates
- Use `create_template_env` to merge core + source templates
- Use `write_all_notes` to output the vault
See drive_og for a complete example.
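
The first step in the list above, mapping a source's native fields onto the `BaseDocument` shape with properties, might look like this. `RssEntry` is an invented example source, and `looks_like_base_document` is a hypothetical stand-in for `is_base_document` that assumes the check is a simple attribute test.

```python
from dataclasses import dataclass, field

# Field names copied from the BaseDocument listing.
REQUIRED_FIELDS = ("id", "title", "body_text", "source_type", "source_group",
                   "created_at", "updated_at", "url", "labels")


def looks_like_base_document(obj: object) -> bool:
    """Duck-typing check: every expected attribute must be present."""
    return all(hasattr(obj, name) for name in REQUIRED_FIELDS)


@dataclass
class RssEntry:
    """Hypothetical source record whose native field names partly differ."""
    guid: str
    title: str
    summary: str
    feed_name: str
    published: str  # ISO 8601 timestamp
    url: str
    labels: list[str] = field(default_factory=list)

    # Properties map the native names onto the expected ones.
    @property
    def id(self) -> str:
        return self.guid

    @property
    def body_text(self) -> str:
        return self.summary

    @property
    def source_type(self) -> str:
        return "rss"

    @property
    def source_group(self) -> str:
        return self.feed_name

    @property
    def created_at(self) -> str:
        return self.published

    @property
    def updated_at(self) -> str:
        return self.published

    @property
    def date(self) -> str:
        return self.updated_at[:10]


entry = RssEntry(guid="e1", title="Hello", summary="text", feed_name="blog",
                 published="2024-01-15T09:00:00Z", url="https://example.com/e1")
```

One caveat with this approach: `dataclasses.asdict()` serializes only declared fields, so property-mapped values such as `id` or `source_group` would not appear in cached JSON unless they are also stored as fields.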

    python -m pytest tests/ -v

The suite contains 19 tests covering all modules.
MIT