Our civilization is built on curiosity. The Curiosity recommender system's goal is to suggest the ideal list of documents to read next after a document has been read.
- Notion.so raw data generation
- Notion.so raw data to markdown

Steps 1 and 2 are handled by texonom/notion-node.
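In practice this stage is done entirely by texonom/notion-node; the sketch below only illustrates the kind of work involved, using the official notion-client Python SDK rather than notion-node, with a placeholder token and page id.

```python
# Illustrative only: texonom/notion-node performs this step in practice.
# This sketch uses the official notion-client SDK (not notion-node) to pull
# raw block data for one page and flatten it into markdown-ish text.
import os
from notion_client import Client

notion = Client(auth=os.environ["NOTION_TOKEN"])  # integration token (placeholder)
page_id = "00000000-0000-0000-0000-000000000000"  # hypothetical page id

blocks = notion.blocks.children.list(block_id=page_id)["results"]
lines = []
for block in blocks:
    if block["type"] == "paragraph":
        text = "".join(rt["plain_text"] for rt in block["paragraph"]["rich_text"])
        lines.append(text)
    elif block["type"] == "heading_1":
        text = "".join(rt["plain_text"] for rt in block["heading_1"]["rich_text"])
        lines.append("# " + text)

print("\n\n".join(lines))
```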
- Markdown to Hugging Face dataset
 
```bash
git clone https://github.com/texonom/texonom-md
python hf_upload.py chroma
```
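hf_upload.py lives in the texonom-md repository; as a rough sketch of what such an upload step can look like with the datasets library (the repo id and file layout are assumptions):

```python
# A rough sketch of pushing the markdown corpus to the Hugging Face Hub.
# The repo id and directory layout are assumptions, not read from hf_upload.py.
from pathlib import Path
from datasets import Dataset

records = [
    {"id": path.stem, "text": path.read_text(encoding="utf-8")}
    for path in Path("texonom-md").rglob("*.md")
]

dataset = Dataset.from_list(records)
dataset.push_to_hub("texonom/texonom-md")  # requires a prior `huggingface-cli login`
```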
 
- Extracted dataset to embedding

Run chroma server

```bash
pm2 start conf/chroma.json
```
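A minimal sketch of talking to that Chroma server from Python; the port and collection name are assumptions, since conf/chroma.json is not shown here:

```python
# Minimal sketch: connect to the Chroma server started above and run a query.
# Host, port, and collection name are assumptions (not taken from conf/chroma.json).
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_or_create_collection(name="texonom")

# Store a document together with a precomputed embedding vector.
collection.add(
    ids=["page-1"],
    documents=["Curiosity is the engine of learning."],
    embeddings=[[0.1] * 384],  # gte-small produces 384-dimensional vectors
)

# Retrieve the most similar documents for a query embedding.
results = collection.query(query_embeddings=[[0.1] * 384], n_results=3)
print(results["ids"])
```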

Run embedding server

```bash
volume=data
model=thenlper/gte-small
docker run -d --name tei --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:0.3.0 --model-id $model
```
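To check that the TEI container is serving embeddings, it can be queried over its /embed HTTP endpoint; the host and port follow the docker command above:

```python
# Quick check against the text-embeddings-inference container started above.
# The /embed endpoint and {"inputs": ...} payload are TEI's HTTP API.
import requests

resp = requests.post(
    "http://localhost:8080/embed",
    json={"inputs": ["What is a curiosity recommender?"]},
    timeout=30,
)
resp.raise_for_status()
embedding = resp.json()[0]  # one 384-dim vector per input for gte-small
print(len(embedding))
```

With the embedding server up, index the dataset into pgvector: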

```bash
python index_to.py pgvector --pgstring <PGSTRING>
# or for local onnx inference
python index_to.py pgvector --pgstring <PGSTRING> --local
```
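index_to.py owns the real schema; the sketch below only illustrates how a similarity query over a pgvector index can look, with assumed table and column names (documents, embedding) and psycopg2 as the client:

```python
# A sketch of querying the pgvector index for nearest-neighbour documents.
# Table/column names ("documents", "embedding") are assumptions; index_to.py
# defines the real schema. <PGSTRING> is the same connection string as above.
import psycopg2

query_embedding = [0.1] * 384  # e.g. the TEI output for the page being read
vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"

conn = psycopg2.connect("<PGSTRING>")
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT id
        FROM documents
        ORDER BY embedding <=> %s::vector  -- cosine distance operator from pgvector
        LIMIT 5
        """,
        (vector_literal,),
    )
    recommendations = [row[0] for row in cur.fetchall()]

print(recommendations)
```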
 
- Use embedding for recommendation
 - build the dictionary dataset without duplicate ids, preferring the most recent record (see the sketch below)
 - tag the dataset with its build date
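
A small sketch of those two notes, assuming the dataset records carry id and last_edited_time fields: keep only the most recently edited record per id, then stamp the result with a build-date tag.

```python
# Sketch of the two notes above: drop duplicate ids by keeping the most
# recently edited record, then tag the deduplicated dataset with a build date.
# Field names ("id", "last_edited_time") are assumptions about the dataset schema.
from datetime import date, datetime

records = [
    {"id": "a", "text": "old version", "last_edited_time": "2023-01-01T00:00:00Z"},
    {"id": "a", "text": "new version", "last_edited_time": "2023-06-01T00:00:00Z"},
    {"id": "b", "text": "only version", "last_edited_time": "2023-03-01T00:00:00Z"},
]

latest = {}
for record in records:
    edited = datetime.fromisoformat(record["last_edited_time"].replace("Z", "+00:00"))
    current = latest.get(record["id"])
    if current is None or edited > current[0]:
        latest[record["id"]] = (edited, record)

deduplicated = [record for _, record in latest.values()]
dataset_tag = f"build-{date.today().isoformat()}"  # e.g. "build-2024-01-31"

print(len(deduplicated), dataset_tag)
```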