This document outlines the plan for extracting the CLI into its own repository while keeping the Discord bot functional.
Key Change: Move CLI-only libraries into the src/cli/ folder BEFORE refactoring them to open source alternatives. This allows immediate CLI extraction without waiting for library replacements.
These remain in shared locations and may be refactored to open source:
| Library | Current Location | Used By | Action |
|---|---|---|---|
| pino | `src/logger.ts` | CLI + Bot | Refactor to open source (HIGH priority) |
| zod | `src/config/cli.ts` | CLI + Bot | Refactor config validation (HIGH priority) |
These are used ONLY by the CLI and should be moved into the CLI folder structure immediately:
| Library | Current Location | CLI Only | Action |
|---|---|---|---|
| openai | `src/llm/client.ts` | ✅ Yes | Move to `src/cli/lib/llm/` - refactor later |
| @qdrant/js-client-rest | `src/vector/qdrant-store.ts` | ✅ Yes | Move to `src/cli/lib/vector/` - refactor later |
| pgvector | Repository layer | ✅ Yes | Move to `src/cli/lib/db/` - refactor later |
| pg | `src/db/client.ts` | ✅ Yes | Move to `src/cli/lib/db/` - refactor later |
| @extractus/article-extractor | `src/ingestion/extractor.ts` | ✅ Yes | Move to `src/cli/lib/ingestion/` - refactor later |
Goal: Enable CLI extraction without refactoring
- Create `src/cli/lib/` directory structure
- Move CLI-only modules into `src/cli/lib/`
- Update all imports within CLI
- Bot continues to invoke CLI via subprocess (no changes needed)
- No refactoring - keep current implementations
CLI Folder Structure After Move:

    src/cli/
      commands/                 # CLI commands
      presenters/               # Progress presenters
      lib/                      # NEW: CLI-only libraries
        llm/
          client.ts             # Moved from src/llm/
        vector/
          qdrant-store.ts       # Moved from src/vector/
        db/
          client.ts             # Moved from src/db/
          repository.ts         # Moved from src/db/
          queue-repository.ts   # Moved from src/db/
        ingestion/
          service.ts            # Moved from src/ingestion/
          extractor.ts          # Moved from src/ingestion/
          youtube.ts            # Moved from src/ingestion/
          url.ts                # Moved from src/ingestion/
    src/config/
      cli.ts                    # CLI config (shared with bot via subprocess)

Benefits:
- CLI can be extracted to separate repo immediately
- Bot repo becomes lightweight (only Discord code + subprocess runner)
- No shared modules to manage between repos
- No git submodules needed
Goal: Replace custom implementations with open source packages
After CLI extraction, refactor each CLI-only library:
- Replace custom LLM client with `openai` npm package
- Replace Qdrant wrapper with `@qdrant/js-client-rest`
- Replace logger with `pino` (already HIGH priority)
- Improve pgvector integration with `pgvector` npm package
- Simplify database client
Benefits:
- Can refactor independently in CLI repo
- No impact on Bot repo
- Can take time to do each replacement properly
- `src/cli/` - CLI source code
- `tests/cli/` - CLI tests
- `src/config/cli.ts` - CLI configuration (shared with Bot via subprocess)
Post-Move Shared Modules:
- `src/config/cli.ts` - CLI config schema (Bot reads, CLI uses)
- `src/logger.ts` - Logger (HIGH priority: refactor to pino)
- `src/interfaces/cli-types.ts` - Type definitions
- `src/bot/cli-runner.ts` - Subprocess runner (Bot only)
- Create new repo `sourcebase-cli`
- Copy `src/cli/` (includes all CLI code + libraries)
- Copy `src/config/cli.ts` and `src/interfaces/cli-types.ts` (shared types)
- Copy `tests/cli/`
- Copy migrations schema
- Bot repo removes `src/cli/` folder
- Both repos have independent package.json
Result:
- CLI repo: Self-contained with all its dependencies
- Bot repo: Minimal (Discord + subprocess runner + shared logger/config)
- No shared modules or submodules needed
- Bot invokes CLI via `npm exec sourcebase-cli` or direct binary
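
The subprocess boundary described above could look roughly like the following. This is a minimal sketch, not the actual contents of `src/bot/cli-runner.ts`: the names `runCli` and `CliResult` are hypothetical, and the command to run is passed in by the caller because the plan leaves open whether it is `npm exec sourcebase-cli` or a direct binary.

```typescript
import { spawn } from "node:child_process";

/** Outcome of one CLI invocation (hypothetical shape). */
interface CliResult {
  exitCode: number | null;
  stdout: string;
  stderr: string;
}

/**
 * Run a command as a child process and collect its output.
 * The Bot never imports CLI code; it only sees the exit code and the streams.
 */
function runCli(command: string, args: string[]): Promise<CliResult> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let stdout = "";
    let stderr = "";

    // Nothing is written to the child's stdin in this sketch, so close it.
    child.stdin.end();

    child.stdout.setEncoding("utf8");
    child.stdout.on("data", (chunk: string) => {
      stdout += chunk;
    });
    child.stderr.setEncoding("utf8");
    child.stderr.on("data", (chunk: string) => {
      stderr += chunk;
    });

    // "error" fires when the process cannot be started at all.
    child.on("error", reject);
    // "close" fires once the process has exited and its streams are drained.
    child.on("close", (exitCode) => resolve({ exitCode, stdout, stderr }));
  });
}
```

A caller might use `runCli("npm", ["exec", "sourcebase-cli"])`, or pass the path to an installed binary instead. Keeping the command a parameter means the switch between the two options needs no change to the runner itself.
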
- Create `src/cli/lib/` directory
- Move `src/llm/client.ts` to `src/cli/lib/llm/`
- Move `src/vector/qdrant-store.ts` to `src/cli/lib/vector/`
- Move `src/db/client.ts`, `repository.ts`, `queue-repository.ts` to `src/cli/lib/db/`
- Move `src/ingestion/service.ts`, `extractor.ts`, `youtube.ts`, `url.ts` to `src/cli/lib/ingestion/`
- Update all imports within CLI to use relative paths
- Verify CLI still builds and tests pass
- Verify Bot still works via subprocess
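
The moves in this checklist can also be written down as data, which makes them easy to review and to turn into `git mv` commands. The sketch below is illustrative only: `MOVES`, `targetPath`, and `gitMvCommands` are hypothetical names, and the file list is copied from the checklist above.

```typescript
// Each entry is [current file, destination directory], as listed in the checklist.
const MOVES: ReadonlyArray<readonly [string, string]> = [
  ["src/llm/client.ts", "src/cli/lib/llm/"],
  ["src/vector/qdrant-store.ts", "src/cli/lib/vector/"],
  ["src/db/client.ts", "src/cli/lib/db/"],
  ["src/db/repository.ts", "src/cli/lib/db/"],
  ["src/db/queue-repository.ts", "src/cli/lib/db/"],
  ["src/ingestion/service.ts", "src/cli/lib/ingestion/"],
  ["src/ingestion/extractor.ts", "src/cli/lib/ingestion/"],
  ["src/ingestion/youtube.ts", "src/cli/lib/ingestion/"],
  ["src/ingestion/url.ts", "src/cli/lib/ingestion/"],
];

/** Full path a file will have after the move (the file name is kept as-is). */
function targetPath(from: string, toDir: string): string {
  const fileName = from.slice(from.lastIndexOf("/") + 1);
  return toDir + fileName;
}

/** One `git mv` command per entry, so history follows the files. */
function gitMvCommands(
  moves: ReadonlyArray<readonly [string, string]> = MOVES,
): string[] {
  return moves.map(([from, toDir]) => `git mv ${from} ${toDir}`);
}

/** Destinations that more than one source file would land on (should be empty). */
function collidingTargets(
  moves: ReadonlyArray<readonly [string, string]> = MOVES,
): string[] {
  const seen = new Set<string>();
  const collisions: string[] = [];
  for (const [from, toDir] of moves) {
    const target = targetPath(from, toDir);
    if (seen.has(target)) collisions.push(target);
    seen.add(target);
  }
  return collisions;
}
```

The collision check matters here because two of the moved files are both named `client.ts`; they stay distinct only because they go to different directories.
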
- Replace LLM client with `openai` npm package
- Replace Qdrant wrapper with `@qdrant/js-client-rest`
- Replace logger with `pino` (HIGH priority - affects Bot)
- Improve pgvector integration
- Simplify database client
CLI repo structure:

    src/
      cli/
        commands/               # CLI commands
        presenters/             # Progress presenters
        lib/                    # CLI-only libraries
          llm/
            client.ts           # Will be refactored to openai
          vector/
            qdrant-store.ts     # Will be refactored to @qdrant/js-client-rest
          db/
            client.ts           # Will be simplified
            repository.ts       # May keep or replace with ORM
            queue-repository.ts
          ingestion/
            service.ts          # Core orchestration (keep)
            extractor.ts        # May keep or replace
            youtube.ts          # May keep or replace
      config/
        cli.ts                  # CLI config
      interfaces/
        cli-types.ts            # Shared types
      logger.ts                 # Will be refactored to pino
    tests/
      cli/                      # CLI tests
    migrations/
      001_initial_schema.sql
    package.json                # All CLI dependencies

Bot repo structure:

    src/
      bot/                      # Bot subprocess runner
        cli-runner.ts           # Invokes CLI as subprocess
      config/
        bot.ts                  # Bot config (extends CLI config)
        cli.ts                  # CLI config schema (copied from CLI repo)
      interfaces/
        cli-types.ts            # Shared types (copied from CLI repo)
      logger.ts                 # Logger (HIGH priority: refactor to pino)
      discord/                  # Discord-specific code
      index.ts                  # Bot entrypoint
    tests/
      bot/                      # Bot tests
        boundary.test.ts        # Boundary enforcement
      config/                   # Config tests
    package.json                # Minimal dependencies (discord.js, pino, zod)

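
The boundary enforcement mentioned for `boundary.test.ts` could be approached along these lines. This is a sketch under assumptions, not the existing test: `findBoundaryViolations` is a hypothetical helper, the rule it applies (no Bot import may resolve into a `cli/` directory) is one plausible reading of "boundary", and it only looks at static `from "..."` specifiers.

```typescript
// Captures the module specifier of static imports and re-exports.
const IMPORT_SPECIFIER = /\bfrom\s+["']([^"']+)["']/g;

/**
 * Given Bot source files as { path: contents }, list every import that
 * reaches into a `cli/` directory. Files such as `config/cli.ts` are allowed,
 * because only a path segment named exactly `cli` counts.
 */
function findBoundaryViolations(sources: Record<string, string>): string[] {
  const violations: string[] = [];
  for (const [file, text] of Object.entries(sources)) {
    for (const match of text.matchAll(IMPORT_SPECIFIER)) {
      const specifier = match[1] ?? "";
      if (/(^|\/)cli\//.test(specifier)) {
        violations.push(`${file} imports ${specifier}`);
      }
    }
  }
  return violations;
}
```

In a real test the file contents would be read from the Bot's `src/` tree and the result asserted to be empty. A regex scan like this misses dynamic `import()` calls and `require`, so it is a cheap guard rather than a complete check.
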
- Create CLI lib directory

      mkdir -p src/cli/lib/{llm,vector,db,ingestion}

- Move files (preserving git history with `git mv`)

      git mv src/llm/client.ts src/cli/lib/llm/
      git mv src/vector/qdrant-store.ts src/cli/lib/vector/
      git mv src/db/client.ts src/cli/lib/db/
      git mv src/db/repository.ts src/cli/lib/db/
      git mv src/db/queue-repository.ts src/cli/lib/db/
      git mv src/ingestion/service.ts src/cli/lib/ingestion/
      git mv src/ingestion/extractor.ts src/cli/lib/ingestion/
      git mv src/ingestion/youtube.ts src/cli/lib/ingestion/
      git mv src/ingestion/url.ts src/cli/lib/ingestion/

- Update imports within CLI to use relative paths
  - Change `from "../../llm/client.js"` to `from "../lib/llm/client.js"`
  - Change `from "../../db/repository.js"` to `from "../lib/db/repository.js"`
  - etc.

- Verify builds

      npm run build
      npm test

- Commit

      git commit -m "Move CLI-only libraries into src/cli/lib/"

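
The import update in the steps above is mechanical, so it can be expressed as a small rewrite function. The sketch below is an assumption-laden illustration: `rewriteSpecifier` and `rewriteImports` are hypothetical names, it only handles files one level below `src/cli/` (such as `src/cli/commands/`), and it assumes everything imported from `llm`, `vector`, `db`, and `ingestion` was moved.

```typescript
// Directories whose contents now live under src/cli/lib/.
const MOVED_DIRS = ["llm", "vector", "db", "ingestion"] as const;

/** Map an old specifier such as "../../llm/client.js" to "../lib/llm/client.js". */
function rewriteSpecifier(specifier: string): string {
  for (const dir of MOVED_DIRS) {
    const oldPrefix = `../../${dir}/`;
    if (specifier.startsWith(oldPrefix)) {
      return `../lib/${dir}/` + specifier.slice(oldPrefix.length);
    }
  }
  // Anything else (for example "../../config/cli.js") is left untouched.
  return specifier;
}

/** Apply the rewrite to every static `from "..."` clause in a source file. */
function rewriteImports(source: string): string {
  return source.replace(
    /(\bfrom\s+["'])([^"']+)(["'])/g,
    (_match, head: string, specifier: string, tail: string) =>
      head + rewriteSpecifier(specifier) + tail,
  );
}
```

Files inside `src/cli/lib/` itself need different handling, since imports between moved modules (for example from `ingestion/` to `db/`) keep the same relative shape after the move.
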
- Create new repo `sourcebase-cli`

- Copy files from current repo:
  - `src/cli/` → `src/`
  - `src/config/cli.ts` → `src/config/cli.ts`
  - `src/interfaces/cli-types.ts` → `src/interfaces/cli-types.ts`
  - `src/logger.ts` → `src/logger.ts` (temporarily)
  - `tests/cli/` → `tests/`
  - `migrations/` → `migrations/`
  - `package.json` → `package.json` (modify for CLI-only)

- Update CLI package.json:
  - Remove Discord-related dependencies
  - Keep all CLI dependencies
  - Update bin entry

- Remove CLI from Bot repo:

      rm -rf src/cli/
      rm -rf tests/cli/

- Update Bot to use CLI via npm/npx:
  - Update `src/bot/cli-runner.ts` to use `npx sourcebase-cli`
  - Or add as dependency and use `node_modules/.bin/sb`

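
The package.json changes listed in the steps above can be sketched as a pure transformation. Everything specific here is an assumption: `toCliPackageJson` is a hypothetical helper, the list of Bot-only dependencies is supplied by the caller, and the `bin` target path is a placeholder (only the binary name `sb` and the package name `sourcebase-cli` come from this plan).

```typescript
/** The subset of package.json fields this sketch touches. */
interface PackageJson {
  name?: string;
  bin?: Record<string, string>;
  dependencies?: Record<string, string>;
  [key: string]: unknown;
}

/**
 * Derive the CLI repo's package.json from the current one:
 * drop Bot-only dependencies, keep the rest, and set the bin entry.
 */
function toCliPackageJson(
  pkg: PackageJson,
  botOnlyDependencies: string[],
  bin: Record<string, string>,
): PackageJson {
  const dependencies = Object.fromEntries(
    Object.entries(pkg.dependencies ?? {}).filter(
      ([name]) => !botOnlyDependencies.includes(name),
    ),
  );
  // The input object is not mutated; unrelated fields are carried over.
  return { ...pkg, name: "sourcebase-cli", bin, dependencies };
}
```

A call might look like `toCliPackageJson(currentPkg, ["discord.js"], { sb: "dist/cli/index.js" })`, where the entry path is hypothetical and should match wherever the build actually emits the CLI entrypoint.
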
In the CLI repo only:
- Replace LLM client with `openai` npm package
- Replace Qdrant wrapper with `@qdrant/js-client-rest`
- Replace logger with `pino` (requires sync with Bot repo)
- Improve pgvector integration
- Simplify database client
- Move first, refactor later - This enables immediate CLI extraction
- CLI intentionally contains all its dependencies after move - this is correct
- No shared modules - Bot only needs its own copies of the logger and config types
- Git submodule not needed - Cleaner separation
- Bot repo becomes minimal - Only Discord code + subprocess runner
- Refactoring can happen later - In CLI repo independently
See child work items of SB-0MNBSCZDC0021Q2J (Refactor for Libraries epic):
Immediate (Move without refactoring):
- Create task to move libraries into src/cli/lib/
High Priority (Shared libraries):
- Replace logger with pino (SB-0MNBSDUQE008P7CU)
- Improve config validation with zod (SB-0MNBSEQ4T002G2QT)
Future (CLI-only, refactor after extraction):
- Replace LLM client with openai (SB-0MNBSDEV8000N7GA)
- Replace Qdrant wrapper (SB-0MNBSDN010091ST9)
- Improve pgvector integration (SB-0MNBSEABT007T6E2)
- Simplify database client (SB-0MNBSEHU1001WJTR)