47 commits
e11a209
add kgrag layer 4
Feb 2, 2026
7dfd58d
fix some formatting issue
Feb 2, 2026
ec5e2d6
Layer 3 of knowledge graph
Mar 5, 2026
bd107db
layer 2 of the project
Mar 5, 2026
bebc66b
layer 1 of kgrag
Mar 5, 2026
4e30946
Fill in the most important functions of KGRag
Mar 8, 2026
36457f3
Fill in the core missing functions
Mar 8, 2026
411ba59
Finished the less important functions
Mar 8, 2026
1ebf39c
Adding testing files
Mar 8, 2026
44825be
Fix issues found in the testing
Mar 8, 2026
42964ec
Implement Phase 2: 8 runnable scripts for KG-RAG pipeline
Mar 8, 2026
6e119ba
Extract common functions from the scripts
Mar 8, 2026
35919aa
Add test function for the utility files
Mar 8, 2026
aa4fa5d
Add the final run.sh script so that one can call it directly.
Mar 9, 2026
7be0ce1
Add two read me files for the library and the example
Mar 9, 2026
4e994c9
fix the issue so that run.sh is runable. However further update are e…
Mar 10, 2026
3cf11b4
Rewrite preprocessing and embedding scripts for mellea-style pipeline
Mar 11, 2026
b130db0
Enhance run_kg_update.py to match mellea's architecture and functiona…
Mar 11, 2026
eb017a3
Add comprehensive KG-RAG pipeline architecture documentation
Mar 11, 2026
60073db
Add run_kg_update.py as Step 3 to run.sh pipeline
Mar 11, 2026
d8a603d
Enable run_kg_update.py to load LLM/Neo4j config from .env file
Mar 11, 2026
e40da15
Fix run_kg_update.py to properly use RITS model from environment
Mar 11, 2026
272de0d
Update KG-RAG example README with three-stage pipeline documentation
Mar 11, 2026
2b36a1e
Remove large data files from git tracking (keep locally)
Mar 11, 2026
fca3a15
Add dataset directory documentation for large data files
Mar 11, 2026
523b244
update the run.sh for step 0
Mar 12, 2026
6e08755
Fix the run_kg_update and run_qa
Mar 16, 2026
6c70efe
fix the run_eval.py and readme
Mar 16, 2026
fe0f99e
remove some intermediate files
Mar 16, 2026
e025a6e
remove the data from git
Mar 16, 2026
2ad1b85
Refactor the scripts to better utilize the library
Mar 18, 2026
8d75eab
Address some of the issue from comments
Mar 18, 2026
427f644
Additional changes found during review
Mar 18, 2026
f3af8ba
Address additional issues of the comments from the PR
Mar 18, 2026
7a15435
Fix the embeder after refactoring
Mar 18, 2026
6125e56
Update embedder
Mar 18, 2026
e5c2b16
Another round of update
Mar 19, 2026
89b1132
make layer 1 only talk to layer 4 via layer 3
Mar 26, 2026
670c25e
Clean readme and comments
Mar 26, 2026
932b75f
Update the two readme
Mar 26, 2026
941275a
Deep intergrate with the predefined classes
Mar 26, 2026
16c487e
Speed up the embedding step
Mar 26, 2026
6428ed8
Address some of the comment left in the PR.
Mar 26, 2026
b18be73
Refactor code to clean further
Mar 26, 2026
a714442
chore: remove CLAUDE.md from repository
Mar 26, 2026
7275226
refactor(test): rename test files to reflect correct layer numbers an…
Mar 26, 2026
49e6d3f
Merge branch 'main' into yzhu/missing_components
ydzhu98 Mar 26, 2026
8 changes: 8 additions & 0 deletions .gitignore
@@ -443,3 +443,11 @@ pyrightconfig.json
.ionide

# End of https://www.toptal.com/developers/gitignore/api/python,direnv,visualstudiocode,pycharm,macos,jetbrains

# KG-RAG Example Data (Large Files - Keep Locally)
docs/examples/kgrag/.env
docs/examples/kgrag/dataset/*.jsonl.bz2
docs/examples/kgrag/dataset/movie/*.json
docs/examples/kgrag/dataset/movie/*.bz2
docs/examples/kgrag/output/
docs/examples/kgrag/data/
1 change: 0 additions & 1 deletion README.md
@@ -1,6 +1,5 @@
<img src="https://github.com/generative-computing/mellea-contribs/raw/main/mellea-contribs.jpg" height=100>


# Mellea Contribs

The `mellea-contribs` repository is an incubation point for contributions to
52 changes: 52 additions & 0 deletions docs/examples/kgrag/.env_template
@@ -0,0 +1,52 @@
# Graph Database Configuration (Neo4j default; adjust for other graph DBs)
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=password

# Data Directory
KG_BASE_DIRECTORY=./dataset
DATA_PATH=./data

# ---------------------------------------------------------------------------
# Primary LLM — any OpenAI-compatible endpoint
# ---------------------------------------------------------------------------
# Option A: OpenAI
# API_KEY=sk-...
# MODEL_NAME=gpt-4o-mini

# Option B: Local Ollama
# API_BASE=http://localhost:11434/v1
# API_KEY=ollama
# MODEL_NAME=llama3.2

# Option C: vLLM / self-hosted
# API_BASE=http://localhost:8000/v1
# API_KEY=dummy
# MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct

# Option D: Azure OpenAI
# API_BASE=https://<your-resource>.openai.azure.com/openai/deployments/<deployment>/
# API_KEY=<azure-key>
# MODEL_NAME=gpt-4o-mini

# ---------------------------------------------------------------------------
# Optional: Separate evaluation model (defaults to primary LLM if unset)
# ---------------------------------------------------------------------------
# EVAL_API_BASE=...
# EVAL_API_KEY=...
# EVAL_MODEL_NAME=...

# ---------------------------------------------------------------------------
# Optional: Embedding model for vector entity alignment
# ---------------------------------------------------------------------------
# EMB_API_BASE=http://localhost:11434/v1
# EMB_API_KEY=ollama
# EMB_MODEL_NAME=nomic-embed-text
# VECTOR_DIMENSIONS=768

# Request Configuration
MAX_RETRIES=3
TIME_OUT=1800

# OpenTelemetry — disable if you don't have a collector running
OTEL_SDK_DISABLED=true
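
The template above says the optional `EVAL_*` variables default to the primary LLM when unset. A minimal sketch of how a script might read these settings is shown below. It is not taken from this PR: the names `LLMConfig`, `load_llm_config`, `load_eval_config`, and `load_request_settings` are hypothetical, and the per-field fallback is an assumption about what "defaults to primary LLM" means. It also assumes the `.env` values have already been exported into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical container for one OpenAI-compatible endpoint."""

    api_base: Optional[str]
    api_key: Optional[str]
    model_name: Optional[str]


def load_llm_config(env: Mapping[str, str], prefix: str = "") -> LLMConfig:
    """Read <prefix>API_BASE, <prefix>API_KEY and <prefix>MODEL_NAME from env."""
    return LLMConfig(
        api_base=env.get(f"{prefix}API_BASE"),
        api_key=env.get(f"{prefix}API_KEY"),
        model_name=env.get(f"{prefix}MODEL_NAME"),
    )


def load_eval_config(env: Mapping[str, str]) -> LLMConfig:
    """Use EVAL_* values where set, otherwise fall back to the primary LLM.

    Assumption: the fallback is applied field by field, so setting only
    EVAL_MODEL_NAME reuses the primary API_BASE and API_KEY.
    """
    primary = load_llm_config(env)
    override = load_llm_config(env, prefix="EVAL_")
    return LLMConfig(
        api_base=override.api_base or primary.api_base,
        api_key=override.api_key or primary.api_key,
        model_name=override.model_name or primary.model_name,
    )


def load_request_settings(env: Mapping[str, str]) -> Tuple[int, int]:
    """Return (MAX_RETRIES, TIME_OUT), using the template values as defaults."""
    return int(env.get("MAX_RETRIES", "3")), int(env.get("TIME_OUT", "1800"))


if __name__ == "__main__":
    # os.environ is a Mapping[str, str], so it can be passed directly.
    print(load_llm_config(os.environ))
    print(load_eval_config(os.environ))
    print(load_request_settings(os.environ))
```

Passing the environment as a plain mapping keeps the functions easy to test with a dictionary instead of the real process environment.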