A self-hosted, Meilisearch-powered contextual search for AI agents. It lets agents search, in plain terms, the documentation, files, and examples that you have provided.
An end‑to‑end, container‑friendly pipeline that:
- Downloads documentation and files from multiple sources into a local output/ folder
- Indexes those files in Meilisearch for fast, flexible search
- Exposes a minimal Model Context Protocol (MCP) server so AI agents can reliably query “user‑loaded” documents and fetch exact source files for grounding
- 📥 Unified downloader: Git and HTTP sources merged into a single tree (output/)
- 🔎 Meilisearch indexing with smart content handling (frontmatter Markdown, YAML/JSON/CSV as structured data)
- 🧭 Safe, explicit scope: indexes are derived from your top‑level folders; optional allow‑list restricts searches and file fetches
- 🔌 MCP server over HTTP or stdio: list indexes, search, and fetch exact files for answer grounding
- 🐳 Batteries included: one docker compose up -d --build runs the whole stack
- 🛠️ Extensible by design: adjust loader rules, file filters, and environment without rebuilding images in most cases
Warning
This project is designed to run via Docker. Install Docker Desktop if you’re on Windows or macOS.
At minimum you must set the Meilisearch master key so dependent services can authenticate.
echo "MEILISEARCH_MASTER_KEY=$(openssl rand -hex 32)" >> .env

Optional variables you can add now or later:
# Restrict which Meilisearch indexes the MCP server will expose (space/comma/newline separated)
MEILISEARCH_ALLOWED_INDEXES="docs guides examples"
# Run containers as your host user (helps with file ownership on ./output)
UID=1000
GID=1000
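
The comment above says the allow-list may be separated by spaces, commas, or newlines. A minimal sketch of how such a value could be parsed is shown below. The function name `parse_allowed_indexes` is hypothetical and not taken from the project, and treating an empty or unset value as "no restriction" is an assumption based on the allow-list being described as optional.

```python
import re
from typing import Optional, Set


def parse_allowed_indexes(raw: Optional[str]) -> Set[str]:
    """Split an allow-list string on spaces, commas, or newlines.

    Hypothetical helper: an empty result is assumed to mean
    "no allow-list configured", not "nothing is allowed".
    """
    if not raw:
        return set()
    # Any run of whitespace and/or commas counts as a single separator.
    return {name for name in re.split(r"[\s,]+", raw.strip()) if name}
```

For example, `parse_allowed_indexes("docs, guides\nexamples")` yields the same three names as the space-separated form shown above.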
Edit data-sources.yml to describe what to download. Use the unified config: shape. A minimal example:
config:
  sources:
    - type: git
      repo: https://github.com/example/docs.git
      subpath: docs
      ref: main
      destination: docs
    - type: http
      url: https://example.com/guide.md
      filename: guide.md
      destination: examples

Per-source filtering can be applied using include/exclude on individual source entries. For example, to exclude a lockfile from a Git source:
config:
  sources:
    - type: git
      repo: https://github.com/nitrojs/nitro.git
      subpath: docs
      destination: nitro
      exclude:
        - "pnpm-lock.yaml"

See the Downloader README for the full schema and filtering rules: src/downloader_web/README.md.
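
To illustrate how include/exclude patterns might select files, here is a rough sketch. It is not the downloader's actual matcher (the Downloader README is the authority on that). It assumes patterns are shell-style globs tested against both the relative path and the bare file name, and that exclude takes precedence over include. The names `is_selected` and `_matches` are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Iterable, Optional


def _matches(rel_path: str, patterns: Iterable[str]) -> bool:
    """True if any pattern matches the relative path or its file name."""
    name = PurePosixPath(rel_path).name
    return any(fnmatchcase(rel_path, p) or fnmatchcase(name, p) for p in patterns)


def is_selected(
    rel_path: str,
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> bool:
    """Decide whether a file survives filtering (assumed semantics)."""
    # Assumption: an exclude match always wins.
    if exclude and _matches(rel_path, list(exclude)):
        return False
    # Assumption: when include is given, only matching files are kept.
    if include:
        return _matches(rel_path, list(include))
    return True
```

Note that `fnmatchcase` lets `*` cross `/`, so this is looser than gitignore-style globbing; a real implementation may differ.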
The top-level keys of the config: block are:

config:
  sources: []
  loaders: []
  destinations: {}
  collections: {}

A fuller example that combines destinations, collections, loaders, and sources:

config:
  destinations:
    docs:
      description: |
        Docs for the main project
    guides:
      description: |
        Guides and tutorials
  collections:
    project:
      description: |
        Core project documentation
      destinations:
        - docs
    learning:
      description: |
        Guides and tutorials
      destinations:
        - guides
  loaders:
    - path: guides
      type: frontmatter
  sources:
    - type: git
      repo: https://github.com/example/docs.git
      subpath: docs
      ref: main
      destination: docs
      include:
        - "**/*.md"
    - type: git
      repo: https://github.com/example/guides.git
      subpath: content
      destination: guides
      exclude:
        - "**/pnpm-lock.yaml"
    - type: http
      url: https://example.com/guide.md
      filename: getting-started.md
      destination: guides

Start the whole stack:

docker compose up -d --build

Services and default ports:
- Meilisearch API: http://localhost:7700
- Downloader Web API: http://localhost:8080 (health, refresh)
- MCP server (HTTP): http://localhost:8000
First run will download sources, write into ./output, index them, and then expose them via the MCP server.
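
Because the first run has to download and index before anything is searchable, a script may want to wait for the downloader to become ready. The sketch below polls the downloader's /health endpoint. It assumes the endpoint returns 200 once ready; this README only mentions a 503 while it is not. The function names, retry count, and delay are illustrative choices, not project defaults.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET, or 0 if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 503 while the downloader is still starting
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, ...


def wait_until_healthy(
    url: str,
    attempts: int = 30,
    delay: float = 2.0,
    get_status: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until it answers 200 or the attempts run out."""
    for attempt in range(attempts):
        if get_status(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A typical call would be `wait_until_healthy("http://localhost:8080/health")`; the `get_status` and `sleep` parameters exist only so the loop can be exercised without a running stack.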
Tip
If you edit data-sources.yml, you can refresh downloads without restarting:
curl -X POST http://localhost:8080/refresh

Fetches files from the internet (HTTP, Git, etc.) into ./output.
Indexes files from ./output into Meilisearch.
Exposes your indexes to AI tooling via MCP.
Compose file: docker-compose.yml ties everything together.
- downloader_web populates ./output from your configured sources
- file_loader performs an initial full index into Meilisearch, then watches for changes
- mcp_server lists/searches those indexes and can fetch the exact file content under ./output
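
Since indexes are derived from the top-level folders of ./output, the expected set of indexes can be predicted from the directory layout. The following is only a sketch under two assumptions that this README does not confirm: that each index is named exactly after its folder, and that hidden folders are skipped. `top_level_indexes` is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def top_level_indexes(output_dir: str) -> List[str]:
    """List the folder names directly under `output_dir`, sorted.

    Assumption: one index per top-level folder, named after the folder.
    Loose files and hidden folders at the top level are ignored.
    """
    root = Path(output_dir)
    if not root.is_dir():
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and not entry.name.startswith(".")
    )
```

Comparing `top_level_indexes("./output")` with what the MCP server lists can help spot a folder that was never indexed.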
- Refresh downloads after changing data-sources.yml:
  curl -X POST http://localhost:8080/refresh
- Restart services if you change environment variables:
  docker compose restart
- Meilisearch not healthy: docker compose logs -f meilisearch
- Downloader not ready (/health 503): check docker compose logs -f downloader_web
- Files not indexed: verify file extensions/size limits in file_loader README and that files live under a top‑level folder in ./output
- MCP search shows no indexes: confirm MEILISEARCH_ALLOWED_INDEXES (if set) and that file_loader created indexes in Meilisearch
- File fetch denied from MCP: path traversal is blocked and, if an allow‑list is set, only first‑segment matches are allowed (e.g., docs/...)
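
The file-fetch rule in the last item can be sketched as a small check: reject absolute paths and any `..` segment, then compare the first path segment against the allow-list when one is set. This is an illustration of the described behaviour, not the MCP server's code; `is_fetch_allowed` is a hypothetical name, and a real implementation would likely also resolve symlinks against ./output.

```python
from pathlib import PurePosixPath
from typing import Collection, Optional


def is_fetch_allowed(rel_path: str, allowed: Optional[Collection[str]] = None) -> bool:
    """Return True if `rel_path` may be fetched (assumed semantics)."""
    # Treat backslashes as separators so "docs\\..\\x" cannot slip through.
    path = PurePosixPath(rel_path.replace("\\", "/"))
    if path.is_absolute() or not path.parts:
        return False
    # Block path traversal outright.
    if ".." in path.parts:
        return False
    # With an allow-list, only the first segment is compared.
    if allowed and path.parts[0] not in allowed:
        return False
    return True
```

With `allowed={"docs"}`, a request for `docs/intro.md` passes while `guides/intro.md` and `docs/../secret.txt` are refused.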