Skip to content

Commit b5f1619

Browse files
committed
Update README with all implemented features and configuration options
1 parent 111cc2d commit b5f1619

File tree

1 file changed

+44
-4
lines changed

1 file changed

+44
-4
lines changed

README.md

Lines changed: 44 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -61,6 +61,15 @@ This project implements a proof‑of‑concept evaluation‑driven fine‑tuning
6161

6262
Such a loop can be particularly useful for domains where quality requirements are high and failure modes are diverse (e.g., legal drafting, safety moderation, tutoring).
6363

64+
## Features
65+
66+
- **Proper Tinker Integration**: Uses renderers for correct loss masking, async futures for performance, and recommended LR schedules
67+
- **EvalOps Integration**: Optional automatic submission of evaluation results for centralized tracking
68+
- **Pydantic Config Validation**: Type-safe configuration with clear error messages
69+
- **Production-Grade Hyperparameters**: Model-specific LR formula, warmup/cosine scheduling, configurable LoRA rank
70+
- **Async Batching**: Overlapping forward/backward and optimizer steps for faster training
71+
- **Comprehensive Tests**: 37 unit and integration tests covering all components
72+
6473
## Usage overview
6574

6675
This project contains two main components:
@@ -70,9 +79,11 @@ This project contains two main components:
7079
| `trainer_with_eval.py` | The main script that orchestrates training and evaluation. It connects to Tinker, creates a LoRA training client, runs fine‑tuning, performs evaluations via Inspect AI, and decides whether to launch further training rounds. |
7180
| `eval_loop_config.json` | A sample configuration file specifying the base model, dataset paths, evaluation tasks, thresholds and hyperparameters. |
7281
| `evalops_client.py` | Python SDK for submitting evaluation results to EvalOps platform. |
73-
| `config_schema.py` | Pydantic schema for configuration validation. |
74-
| `data_loader.py` | JSONL data loader with validation, deduplication, and tokenization. |
82+
| `config_schema.py` | Pydantic schema for configuration validation with hyperparameter tuning. |
83+
| `data_loader.py` | JSONL data loader with proper Tinker renderers, loss masking, validation, and deduplication. |
7584
| `data_selector.py` | Utilities for mining hard examples based on evaluation failures. |
85+
| `hyperparam_utils.py` | Tinker's recommended LR formula and warmup/cosine scheduler. |
86+
| `simple_eval.py` | Minimal working evaluator for demo (replace with Inspect AI for production). |
7687
| `requirements.txt` | Dependencies required to run the script. |
7788
| `tests/` | Unit and integration tests for all components. |
7889

@@ -146,9 +157,28 @@ To use EvalOps integration:
146157

147158
The client will automatically submit results after each evaluation round, making it easy to track progress over time and compare different fine-tuning runs.
148159

160+
## Configuration Options
161+
162+
Key configuration parameters in `eval_loop_config.json`:
163+
164+
| Parameter | Default | Description |
165+
|-----------|---------|-------------|
166+
| `base_model` | - | Model to fine-tune (e.g., "meta-llama/Llama-3.1-8B-Instruct") |
167+
| `lora_rank` | 16 | LoRA adapter rank (1-256) |
168+
| `learning_rate` | 0.0001 | Initial learning rate |
169+
| `use_recommended_lr` | false | Use Tinker's model-specific LR formula |
170+
| `warmup_steps` | 100 | LR warmup steps |
171+
| `max_steps` | 1000 | Total training steps for cosine decay |
172+
| `batch_size` | 8 | Training batch size |
173+
| `steps_per_round` | 1 | Training steps per evaluation round |
174+
| `eval_threshold` | 0.85 | Minimum score to stop training |
175+
| `max_rounds` | 3 | Maximum training rounds |
176+
| `renderer_name` | "llama3" | Renderer for proper chat formatting |
177+
| `evalops_enabled` | false | Enable EvalOps integration |
178+
149179
## Extending this project
150180

151-
This is a minimal prototype to demonstrate how to build a useful system on top of Tinker. Future extensions could include:
181+
This is a production-ready prototype demonstrating best practices from Tinker documentation. Future extensions could include:
152182

153183
- **Custom data selection** based on evaluation feedback. For example, automatically mine additional examples from your corpora that match prompts where the model performs poorly.
154184

@@ -171,6 +201,16 @@ The test suite includes:
171201
- **Integration tests** for the training loop with mocked Tinker/EvalOps services
172202
- **Coverage** for early stopping, LR decay, and error handling
173203

204+
## Implementation Notes
205+
206+
**Based on Tinker Documentation:**
207+
- Uses `renderers.build_supervised_example()` for proper loss masking (trains only on assistant outputs)
208+
- Implements async futures with `forward_backward_async()` and `optim_step_async()` for performance
209+
- Uses `save_weights_for_sampler()` for evaluation (not `save_state()` which includes optimizer state)
210+
- Supports Tinker's recommended LR formula: `LR = 5e-5 × 10 × (2000/H_m)^P_m` with model-specific exponents
211+
- Includes warmup + cosine decay scheduler for stable training
212+
- Gracefully falls back when tinker-cookbook unavailable (for testing/development)
213+
174214
## Disclaimer
175215

176-
This code does not run training jobs by itself; it serves as a scaffold. You'll need an active Tinker API key and appropriate computing quotas to execute the training and evaluation steps. Modify the script to fit your particular needs and model lineup. The hyperparameters and thresholds in the sample config are placeholders and should be adjusted based on your use case and dataset size.
216+
This code requires an active Tinker API key and appropriate computing quotas to execute training and evaluation. The implementation follows Tinker's documented best practices and is suitable for production use with real evaluation tasks. The simple evaluator is for demo purposes only—replace with Inspect AI integration for production deployments.

0 commit comments

Comments
 (0)