Minimal training framework driven by a single YAML config. Uses OmegaConf, Hydra's `instantiate`, and a simple train pipeline. Supports SFT of Qwen/Qwen3-1.7B on NVIDIA GPUs.
```bash
# From project root (ensure PYTHONPATH includes project root)
python -m train --config nanoflash/config/qwen3_0.6b.yaml

# Override config via CLI
python -m train --config nanoflash/config/qwen3_0.6b.yaml train.batch_size=8 train.max_steps=500
```

Dependencies:

- PyTorch (with CUDA for GPU)
- transformers
- datasets
- omegaconf
- hydra-core
```bash
pip install torch transformers datasets omegaconf hydra-core
```

All components are configured via YAML, with `class_path` pointing to the class or factory (a sketch follows the list below):
- `model`: causal LM (e.g. Qwen3)
- `data`: dataset (e.g. Alpaca-style)
- `optimizer`, `lr_scheduler`, `loss`, `collate_fn`
- `checkpoint`, `logging`
Use `${key.subkey}` for interpolation (e.g. `${model.model_name_or_path}`).
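As a rough sketch, a config in this style might look like the following. The component classes, field names, and values here are illustrative assumptions, not nanoflash's actual schema:

```yaml
# Illustrative sketch only; keys and classes are assumptions, not nanoflash's schema.
model:
  class_path: transformers.AutoModelForCausalLM.from_pretrained
  model_name_or_path: Qwen/Qwen3-1.7B

data:
  class_path: my_project.data.AlpacaDataset        # hypothetical dataset class
  tokenizer_name: ${model.model_name_or_path}      # interpolation reuses the model name

optimizer:
  class_path: torch.optim.AdamW
  lr: 2.0e-5

train:
  batch_size: 8
  max_steps: 500
```

Note how the `${model.model_name_or_path}` interpolation lets the data section reuse the model's checkpoint name without duplicating it.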
To load components from an external directory, set `project` in the config:
```yaml
project: /path/to/your/code
```

This prepends the path to `sys.path` so `class_path` imports can resolve from that tree.
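A minimal sketch of how this could be wired, assuming the config is loaded with OmegaConf; the `load_config` helper here is hypothetical, not nanoflash's actual entry point:

```python
# Hypothetical sketch of how `project` could be applied; not nanoflash's actual code.
import sys
from omegaconf import OmegaConf

def load_config(path: str):
    cfg = OmegaConf.load(path)
    project_dir = cfg.get("project")
    if project_dir:
        # Prepend so `class_path` imports resolve against the external tree first.
        sys.path.insert(0, str(project_dir))
    return cfg
```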
- New loss/dataset: implement the class or factory, then set `class_path` in YAML. No code changes in nanoflash are needed (see the loss sketch below).
- New recipe: implement a class with `__init__`, `setup`, `train`, and `cleanup`; register it in `nanoflash.pipeline.recipe.RECIPES`; set `recipe: your_name` in YAML (see the skeleton below).
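As an illustration of the first point, a custom loss might look like this. The module path `my_project.losses`, the class name, and the `(logits, labels)` call convention are assumptions for the sketch:

```python
# my_project/losses.py -- hypothetical module; any importable path works.
import torch
import torch.nn as nn

class LabelSmoothedCE(nn.Module):
    """Cross-entropy with label smoothing.

    Assumes the pipeline calls the loss as loss(logits, labels); check the
    recipe's actual call convention before relying on this.
    """

    def __init__(self, smoothing: float = 0.1, ignore_index: int = -100):
        super().__init__()
        self.ce = nn.CrossEntropyLoss(label_smoothing=smoothing,
                                      ignore_index=ignore_index)

    def forward(self, logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Flatten (batch, seq, vocab) logits against (batch, seq) labels.
        return self.ce(logits.reshape(-1, logits.size(-1)), labels.reshape(-1))
```

It would then be wired in via the config:

```yaml
loss:
  class_path: my_project.losses.LabelSmoothedCE
  smoothing: 0.1
```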
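For the second point, a recipe skeleton might look like the following, assuming the constructor receives the resolved config and the other hooks take no arguments; the actual signatures nanoflash expects are not documented here:

```python
# my_project/recipes.py -- hypothetical; method signatures are assumptions.
class MyRecipe:
    def __init__(self, cfg):
        self.cfg = cfg

    def setup(self):
        # Instantiate model, data, optimizer, etc. from self.cfg here.
        ...

    def train(self):
        # Run the training loop.
        ...

    def cleanup(self):
        # Release resources, flush and close loggers, etc.
        ...
```

Registration could then be a one-liner (assuming `RECIPES` is a plain dict):

```python
from nanoflash.pipeline.recipe import RECIPES
RECIPES["your_name"] = MyRecipe
```

with `recipe: your_name` in the YAML selecting it.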