
Regulation Document Chatbot

Setup

Prerequisites

  • Node.js 20+
  • Ollama installed locally

Install dependencies

npm install

Configure environment

Create .env.local based on .env.example:

cp .env.example .env.local
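A minimal .env.local might look like the following. The variable names here are illustrative assumptions, not the repo's actual keys — check .env.example for the real ones:

```
# Hypothetical example values; see .env.example for the actual variable names
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3
```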

Pull a local model for Ollama:

ollama pull llama3

How to run

1) Ingest documents

This parses the HTML/PDF files in docs/, chunks them, generates embeddings, and writes data/knowledge-base.json.

npm run ingest
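To make the ingest output concrete, here is a sketch of what one entry in data/knowledge-base.json could look like. The field names are assumptions for illustration; the actual schema is whatever scripts/ingest.ts writes:

```typescript
// Illustrative shape of one chunk record in data/knowledge-base.json.
// Field names are hypothetical; the real schema is defined by scripts/ingest.ts.
interface Chunk {
  id: string;
  source: { doc: string; section?: string; page?: number };
  text: string;
  embedding: number[]; // vector produced by @xenova/transformers
}

const example: Chunk = {
  id: "regulation-1#0",
  source: { doc: "regulation-1.pdf", page: 3 },
  text: "First chunk of normalized document text",
  embedding: [0.12, -0.03, 0.27],
};

console.log(example.embedding.length); // 3
```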

2) Start the app

npm run dev

Then open http://localhost:3000 and ask questions about the documents.

Approach and architecture

  • Ingestion: scripts/ingest.ts parses the HTML/PDFs, normalizes text, chunks it, and generates embeddings with @xenova/transformers. Output is stored locally as a JSON knowledge base.
  • Retrieval: src/lib/retrieval.ts loads the knowledge base, embeds the user query, computes cosine similarity, and returns top-K chunks with source metadata.
  • LLM: src/lib/ollama.ts calls a local Ollama model with the retrieved context.
  • API: /api/chat wires retrieval + Ollama and returns { answer, sources }.
  • UI: A simple chat UI renders messages and source citations.
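The retrieval step described above can be sketched as follows. This is a minimal standalone version, not the repo's actual src/lib/retrieval.ts (function and field names are assumptions): embed the query, score every stored chunk by cosine similarity, and return the top-K:

```typescript
// Minimal sketch of embedding-based retrieval: rank chunks by cosine
// similarity to a query vector and keep the K best matches.
type Chunk = { text: string; embedding: number[]; source: string };

// Cosine similarity: dot product divided by the product of vector norms.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Sort a copy of the chunks by similarity to the query, highest first.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}

// Tiny demo with 2-D vectors standing in for real embeddings:
const chunks: Chunk[] = [
  { text: "a", embedding: [1, 0], source: "doc1" },
  { text: "b", embedding: [0, 1], source: "doc2" },
];
console.log(topK([0.9, 0.1], chunks, 1)[0].text); // "a"
```

In the real pipeline the query vector would come from the same @xenova/transformers model used at ingest time, so query and chunk embeddings live in the same vector space.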

Notes

  • Source documents live in docs/; HTML and PDF are the only supported formats.
  • Source attribution is derived from document metadata (doc name, section, page).
