
CLI Reference

Lazarus provides a unified command-line interface for training, inference, data generation, and tokenizer utilities.

Installation

After installing the package, the lazarus command is available:

lazarus --help

Commands

train

Train models using SFT or DPO.

train sft

Supervised Fine-Tuning on instruction data.

lazarus train sft --model MODEL --data DATA [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| `--model` | required | Model name or path |
| `--data` | required | Training data path (JSONL) |
| `--eval-data` | - | Evaluation data path |
| `--output` | `./checkpoints/sft` | Output directory |
| `--epochs` | 3 | Number of epochs |
| `--batch-size` | 4 | Batch size |
| `--learning-rate` | 1e-5 | Learning rate |
| `--max-length` | 512 | Max sequence length |
| `--use-lora` | false | Enable LoRA |
| `--lora-rank` | 8 | LoRA rank |
| `--mask-prompt` | false | Mask prompt in loss |
| `--log-interval` | 10 | Log every N steps |

Example:

lazarus train sft \
  --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
  --data ./data/train.jsonl \
  --use-lora \
  --epochs 3
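When `--mask-prompt` is enabled, prompt tokens are excluded from the loss so only the response is trained on. A minimal sketch of how such masking typically works (the token IDs are made up, and the `-100` ignore index follows the common PyTorch cross-entropy convention, not necessarily Lazarus internals):

```python
# Illustrative sketch of prompt masking: label positions covering the
# prompt are set to an ignore index so the loss only covers the response.
IGNORE_INDEX = -100  # conventional "ignored" label for cross-entropy losses

def build_labels(prompt_ids, response_ids, mask_prompt=True):
    """Concatenate prompt and response token IDs and build training labels.

    With mask_prompt=True, prompt positions get IGNORE_INDEX so they
    contribute nothing to the training loss.
    """
    input_ids = list(prompt_ids) + list(response_ids)
    if mask_prompt:
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    else:
        labels = list(input_ids)
    return input_ids, labels

# Hypothetical token IDs for a prompt and its response.
inp, lab = build_labels([101, 202, 303], [404, 505])
```

Without masking, the model also learns to reproduce the prompt text, which is usually wasted capacity for instruction tuning.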

train dpo

Direct Preference Optimization training.

lazarus train dpo --model MODEL --data DATA [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| `--model` | required | Policy model name or path |
| `--ref-model` | same as model | Reference model |
| `--data` | required | Preference data path (JSONL) |
| `--eval-data` | - | Evaluation data path |
| `--output` | `./checkpoints/dpo` | Output directory |
| `--epochs` | 3 | Number of epochs |
| `--batch-size` | 4 | Batch size |
| `--learning-rate` | 1e-6 | Learning rate |
| `--beta` | 0.1 | DPO beta parameter |
| `--max-length` | 512 | Max sequence length |
| `--use-lora` | false | Enable LoRA |
| `--lora-rank` | 8 | LoRA rank |

Example:

lazarus train dpo \
  --model ./checkpoints/sft/final \
  --data ./data/preferences.jsonl \
  --beta 0.1
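`--beta` is the β in the standard DPO objective: it scales how strongly the policy is pushed away from the reference model on each preference pair. A pure-Python sketch of the per-example loss (the sequence log-probabilities here are illustrative inputs, not values produced by Lazarus):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * (policy margin - ref margin)).

    The margin is how much more the policy prefers the chosen response
    over the rejected one, relative to the frozen reference model.
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Hypothetical log-probs: policy slightly prefers the chosen response.
loss = dpo_loss(-12.3, -15.1, -13.0, -14.2, beta=0.1)
```

A larger β makes the same preference margin count for more, i.e. stronger divergence from the reference model.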

generate

Generate synthetic training data.

lazarus generate --type TYPE [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| `--type` | required | Data type (math) |
| `--output` | `./data/generated` | Output directory |
| `--sft-samples` | 10000 | Number of SFT samples |
| `--dpo-samples` | 5000 | Number of DPO samples |
| `--seed` | 42 | Random seed |

Example:

lazarus generate --type math --output ./data/lazarus --sft-samples 5000
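In principle, seeded synthetic math generation is just a deterministic sampler emitting SFT-format records. A hedged sketch (the record schema matches the SFT format documented below; the generator itself is illustrative, not Lazarus's implementation):

```python
import json
import random

def generate_math_sft(n_samples, seed=42):
    """Generate simple arithmetic prompt/response pairs as SFT-format dicts."""
    rng = random.Random(seed)  # dedicated RNG so runs are reproducible
    samples = []
    for _ in range(n_samples):
        a, b = rng.randint(1, 99), rng.randint(1, 99)
        samples.append({
            "prompt": f"What is {a} + {b}?",
            "response": f"{a} + {b} = {a + b}.",
        })
    return samples

# Same seed -> identical dataset, which is what --seed is for.
lines = [json.dumps(s) for s in generate_math_sft(3, seed=42)]
```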

infer

Run inference on a model.

lazarus infer --model MODEL [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| `--model` | required | Model name or path |
| `--adapter` | - | LoRA adapter path |
| `--prompt` | - | Single prompt |
| `--prompt-file` | - | File with prompts |
| `--max-tokens` | 256 | Max tokens to generate |
| `--temperature` | 0.7 | Sampling temperature |

Examples:

# Single prompt
lazarus infer --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --prompt "Hello!"

# With adapter
lazarus infer --model model-name --adapter ./checkpoints/lora --prompt "Test"

# Interactive mode (no --prompt)
lazarus infer --model model-name
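`--temperature` divides the logits before the softmax: values below 1 sharpen the distribution toward the top token, values above 1 flatten it toward uniform sampling. A small pure-Python illustration (the logits are toy values, not from a real model):

```python
import math

def softmax_with_temperature(logits, temperature=0.7):
    """Scale logits by 1/temperature, then softmax into sampling probabilities."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Low temperature: near-greedy. High temperature: near-uniform.
probs_cold = softmax_with_temperature([2.0, 1.0, 0.5], temperature=0.1)
probs_hot = softmax_with_temperature([2.0, 1.0, 0.5], temperature=5.0)
```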

tokenizer

Tokenizer utilities for inspecting and debugging tokenization.

tokenizer encode

Encode text to tokens and display in a table.

lazarus tokenizer encode -t TOKENIZER [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| `-t, --tokenizer` | required | Tokenizer name or path |
| `--text` | - | Text to encode |
| `-f, --file` | - | File to encode |
| `--special-tokens` | false | Add special tokens |

Examples:

# Encode text
lazarus tokenizer encode -t TinyLlama/TinyLlama-1.1B-Chat-v1.0 --text "Hello world"

# Encode file
lazarus tokenizer encode -t model-name --file input.txt --special-tokens

tokenizer decode

Decode token IDs back to text.

lazarus tokenizer decode -t TOKENIZER --ids IDS
| Option | Default | Description |
| --- | --- | --- |
| `-t, --tokenizer` | required | Tokenizer name or path |
| `--ids` | required | Token IDs (comma or space separated) |

Example:

lazarus tokenizer decode -t TinyLlama/TinyLlama-1.1B-Chat-v1.0 --ids "1,2,3,4,5"

tokenizer vocab

Display vocabulary information and search tokens.

lazarus tokenizer vocab -t TOKENIZER [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| `-t, --tokenizer` | required | Tokenizer name or path |
| `--show-all` | false | Show full vocabulary |
| `-s, --search` | - | Search for tokens containing string |
| `--limit` | 50 | Max search results |
| `--chunk-size` | 1000 | Chunk size for full display |
| `--pause` | false | Pause between chunks |

Examples:

# Show vocab stats
lazarus tokenizer vocab -t TinyLlama/TinyLlama-1.1B-Chat-v1.0

# Search for tokens
lazarus tokenizer vocab -t model-name --search "hello" --limit 20

# Show full vocabulary
lazarus tokenizer vocab -t model-name --show-all --pause

tokenizer compare

Compare tokenization between two tokenizers.

lazarus tokenizer compare -t1 TOKENIZER1 -t2 TOKENIZER2 --text TEXT
| Option | Default | Description |
| --- | --- | --- |
| `-t1, --tokenizer1` | required | First tokenizer |
| `-t2, --tokenizer2` | required | Second tokenizer |
| `--text` | required | Text to compare |

Example:

lazarus tokenizer compare \
  -t1 TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
  -t2 meta-llama/Llama-2-7b-hf \
  --text "The quick brown fox jumps over the lazy dog."

serve

Start the OpenAI-compatible HTTP inference server.

lazarus serve [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| `--model` / `-m` | required | HuggingFace model ID or local path |
| `--host` | 0.0.0.0 | Bind address |
| `--port` / `-p` | 8080 | Port |
| `--protocols` | openai | Comma-separated: openai (others planned) |
| `--api-key` | None | Bearer token; if set, all requests must include `Authorization: Bearer <key>` |
| `--max-tokens` | 512 | Default `max_tokens` when callers do not specify one |

Examples:

# Basic server
lazarus serve --model google/gemma-3-4b-it

# With auth and a custom port
lazarus serve --model google/gemma-3-1b-it --port 9000 --api-key mysecret

# Higher token budget
lazarus serve --model google/gemma-3-1b-it --max-tokens 2048

The standalone lazarus-serve script is an alias for lazarus serve:

lazarus-serve --model google/gemma-3-4b-it

See server.md for the full server guide including endpoints, tool calling, and mcp-cli integration.
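Because the server speaks the OpenAI chat-completions protocol, any OpenAI-style client can talk to it. A minimal stdlib sketch that builds a request for a local instance (the `/v1/chat/completions` path follows the OpenAI convention; defer to server.md for the exact endpoints):

```python
import json
import urllib.request

def build_chat_request(prompt, model, base_url="http://localhost:8080",
                       api_key=None, max_tokens=256):
    """Build an OpenAI-style chat-completions request for a lazarus serve instance."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    headers = {"Content-Type": "application/json"}
    if api_key:  # matches the --api-key bearer-token scheme
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )

req = build_chat_request("Hello!", "google/gemma-3-1b-it", api_key="mysecret")
# Send with: urllib.request.urlopen(req)
```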

introspect

Mechanistic interpretability tools for understanding model internals. See introspection.md for full documentation.

Quick Examples:

# Logit lens analysis
lazarus introspect analyze -m model -p "The capital of France is"

# Activation steering
lazarus introspect steer -m model --extract --positive "good" --negative "bad" -o direction.npz

# Ablation study
lazarus introspect ablate -m model -p "45 * 45 =" -c "2025" --layers 20-23

# Linear probe
lazarus introspect probe -m model --class-a "hard problems" --class-b "easy problems"

# Systematic arithmetic testing
lazarus introspect arithmetic -m model --hard-only

# Uncertainty detection
lazarus introspect uncertainty -m model --prompts "test prompts"

# Multi-class classifier detection (operation classifiers)
lazarus introspect classifier -m model \
  --classes "multiply:7 * 8 = |12 * 5 = " \
  --classes "add:23 + 45 = |17 + 38 = " \
  --test "11 * 12 = |13 + 14 = "

# Logit lens analysis (vocabulary projection)
lazarus introspect logit-lens -m model \
  --prompts "7 * 8 = |23 + 45 = " \
  --targets "multiply" --targets "add"

# Dual reward training (classifier + answer)
lazarus introspect dual-reward -m model --steps 500 --cls-weight 0.4

# MoE expert analysis (semantic trigram methodology)
lazarus introspect moe-expert explore -m openai/gpt-oss-20b

# MoE type detection (pseudo vs native)
lazarus introspect moe-expert moe-type-analyze -m openai/gpt-oss-20b --visualize

# Compare MoE types between models
lazarus introspect moe-expert moe-type-compare -m openai/gpt-oss-20b -c allenai/OLMoE-1B-7B-0924

All introspect subcommands: analyze, compare, generate, hooks, probe, classifier, logit-lens, dual-reward, neurons, directions, operand-directions, embedding, early-layers, activation-cluster, steer, ablate, patch, weight-diff, activation-diff, layer, format-sensitivity, arithmetic, commutativity, metacognitive, uncertainty, memory, memory-inject, circuit (capture, invoke, test, view, compare, decode), moe-expert (explore, domain-test, token-routing, full-taxonomy, moe-type-analyze, moe-type-compare, moe-overlay-compute, moe-overlay-verify, moe-overlay-estimate).

See introspect-moe-expert.md for full MoE expert documentation.
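At its core, logit-lens analysis projects an intermediate hidden state through the model's unembedding matrix to see which vocabulary token that layer already "predicts". A toy illustration of the projection step (the dimensions, weights, and vocabulary are made up; real models also apply a final layer norm before this projection):

```python
def logit_lens(hidden_state, unembed, vocab):
    """Project one hidden state through the unembedding matrix and
    return the vocabulary token it most strongly predicts."""
    logits = [
        sum(h * w for h, w in zip(hidden_state, row))
        for row in unembed  # one weight row per vocabulary entry
    ]
    best = max(range(len(logits)), key=lambda i: logits[i])
    return vocab[best], logits

# Toy 2-d hidden state and a 3-word vocabulary (all values hypothetical).
vocab = ["add", "multiply", "subtract"]
unembed = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
token, logits = logit_lens([0.2, 0.9], unembed, vocab)
```

Running this per layer shows where in the network a prediction like "multiply" first emerges.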

Data Formats

SFT Data (JSONL)

{"prompt": "What is 2+2?", "response": "2+2 equals 4."}
{"prompt": "Explain gravity.", "response": "Gravity is a force..."}

DPO Preference Data (JSONL)

{"prompt": "Question?", "chosen": "Good answer", "rejected": "Bad answer"}
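A small validator for both record formats can catch malformed lines before a training run. The required keys come from the formats above; the helper itself is illustrative, not part of the Lazarus API:

```python
import json

SFT_KEYS = {"prompt", "response"}
DPO_KEYS = {"prompt", "chosen", "rejected"}

def validate_jsonl(lines, required_keys):
    """Return (ok, errors): parse each JSONL line and check required keys."""
    errors = []
    for i, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append(f"line {i}: invalid JSON ({exc.msg})")
            continue
        missing = required_keys - record.keys()
        if missing:
            errors.append(f"line {i}: missing keys {sorted(missing)}")
    return not errors, errors

ok, errs = validate_jsonl(
    ['{"prompt": "Q?", "response": "A."}', '{"prompt": "Q?"}'],
    SFT_KEYS,
)
```

Run the same check with `DPO_KEYS` against preference data before `lazarus train dpo`.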