LM-Trainer

A modular and efficient library for training language models with a focus on performance and ease of use.

Overview

LM-Trainer is a comprehensive framework for training, fine-tuning, and evaluating language models. It provides efficient implementations of state-of-the-art techniques and a modular architecture that makes it easy to customize and extend.

Key Features

  • Modular Architecture: Clean separation of concerns with well-defined modules
  • Efficiency Optimizations: Fused attention, gradient compression, and memory optimization
  • Multiple Training Strategies: Support for various optimizers, schedulers, and training techniques
  • Distributed Training: Built-in support for multi-GPU and distributed training
  • PEFT Integration: Parameter-Efficient Fine-Tuning methods like LoRA
  • Comprehensive Evaluation: Rich set of metrics and benchmarking tools
  • Flexible Configuration: JSON-based configuration system
  • CLI Tools: Command-line interfaces for training, generation, and evaluation

Installation

# Clone the repository
git clone https://github.com/XenArcAI/LM-Trainer.git
cd LM-Trainer

# Install in development mode
pip install -e .

Project Structure

src/lm_trainer/
├── cli/              # Command-line interface tools
├── core/             # Core utilities, exceptions, and constants
├── data/             # Data loading and preprocessing
├── extensions/       # Third-party integrations (PEFT, Accelerate, etc.)
├── inference/        # Text generation and model serving
├── models/           # Model architectures and components
├── tokenizers/       # Tokenization utilities
├── training/         # Training loop, optimizers, and evaluation
├── utilities/        # Logging, monitoring, and system utilities
└── __init__.py       # Package initialization

Training Module Structure

training/
├── configuration/    # Training and model configuration
├── core/             # Base trainer, callbacks, and checkpointing
├── evaluation/       # Metrics and benchmarking
├── optimization/     # Optimizers, schedulers, and gradient handling
└── __init__.py

Quick Start

Training a Model

# Basic training
lm-train --train-data path/to/train/data --output-dir ./output

# Training with configuration file
lm-train --config configs/medium_model.json --model-config configs/model_config.json

Generating Text

# Generate text with a trained model
lm-generate --model-path ./output/model.pt --prompt "Once upon a time"

Evaluating a Model

# Evaluate a trained model
lm-evaluate --model-path ./output/model.pt --eval-data path/to/eval/data

Configuration

LM-Trainer uses JSON configuration files for flexible setup:

{
  "model": {
    "vocab_size": 32000,
    "d_model": 512,
    "n_heads": 8,
    "n_layers": 6,
    "d_ff": 2048,
    "max_seq_len": 1024
  },
  "training": {
    "batch_size": 16,
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
    "num_epochs": 10,
    "lr_scheduler": "cosine",
    "warmup_steps": 1000,
    "optimizer": "adamw",
    "use_amp": true,
    "save_steps": 1000,
    "eval_steps": 500,
    "logging_steps": 100,
    "device": "auto"
  }
}
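
The same settings can be loaded programmatically and mapped onto the configuration objects used later in this README. A minimal sketch, assuming ModelConfig and TrainingConfig accept the JSON keys above as keyword arguments (the file path is illustrative):

import json

from lm_trainer.training.configuration import ModelConfig, TrainingConfig

# Load a JSON file with the layout shown above (path is illustrative)
with open("configs/my_config.json") as f:
    config = json.load(f)

# Assumes the config classes accept these keys as keyword arguments
model_config = ModelConfig(**config["model"])
training_config = TrainingConfig(**config["training"])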

Advanced Features

Parameter-Efficient Fine-Tuning

from lm_trainer.extensions.peft import LoraConfig, apply_lora_to_model

# Configure LoRA
lora_config = LoraConfig(r=8, alpha=16, dropout=0.1)

# Apply to model
model, lora_adapter = apply_lora_to_model(model, ["q_proj", "v_proj"], lora_config)
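
For intuition, LoRA freezes the original weight matrix and learns a low-rank additive update. The standalone sketch below illustrates the idea in plain PyTorch; it is not LM-Trainer's own implementation, and the class and parameter names are illustrative:

import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative LoRA layer: frozen base linear plus a trainable low-rank update."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16, dropout: float = 0.1):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the original weights
        self.lora_a = nn.Linear(base.in_features, r, bias=False)   # down-projection
        self.lora_b = nn.Linear(r, base.out_features, bias=False)  # up-projection
        nn.init.zeros_(self.lora_b.weight)  # the update starts as a no-op
        self.dropout = nn.Dropout(dropout)
        self.scaling = alpha / r

    def forward(self, x):
        # Original output plus the scaled low-rank correction
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(self.dropout(x)))

Only the two small projection matrices are trained, which is why LoRA keeps the number of trainable parameters a small fraction of the full model.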

Distributed Training

from lm_trainer.extensions.accelerate import AcceleratorWrapper

# Initialize accelerator
accelerator = AcceleratorWrapper(mixed_precision="fp16")

# Prepare for distributed training
model, dataloader, optimizer = accelerator.prepare(model, dataloader, optimizer)
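
The prepared objects are then used in an ordinary training loop, with device placement and mixed precision handled by the accelerator. A minimal sketch continuing from the prepare() call above, with an illustrative batch layout and a standard cross-entropy loss:

import torch.nn.functional as F

model.train()
for input_ids, labels in dataloader:  # batch layout is illustrative
    optimizer.zero_grad()
    logits = model(input_ids)  # assumes the model returns logits of shape (batch, seq, vocab)
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
    loss.backward()  # if the wrapper mirrors Hugging Face Accelerate, accelerator.backward(loss) would be used here
    optimizer.step()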

Custom Training Loop

from lm_trainer.training.core import Trainer
from lm_trainer.training.configuration import TrainingConfig, ModelConfig

# Create configurations
model_config = ModelConfig(vocab_size=32000, d_model=512, n_heads=8, n_layers=6)
training_config = TrainingConfig(batch_size=16, learning_rate=3e-5, num_epochs=5)

# Create trainer (model, train_loader, eval_loader, and optimizer are assumed to be defined beforehand)
trainer = Trainer(
  model=model,
  train_dataloader=train_loader,
  eval_dataloader=eval_loader,
  optimizer=optimizer,
  config=training_config,
  model_config=model_config
)

# Start training
results = trainer.train()

Performance Optimizations

LM-Trainer includes several performance optimizations:

  • Fused Attention: Uses PyTorch's scaled_dot_product_attention for efficient attention computation (see the sketch after this list)
  • Mixed Precision: Automatic mixed precision training support
  • Gradient Compression: For efficient distributed training
  • Memory Optimization: Gradient checkpointing and efficient memory management
  • Batch Size Tuning: Automatic batch size adjustment based on available memory
  • Model Compilation: PyTorch 2.0+ compilation for faster execution
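
For reference, the fused attention path is built on PyTorch's scaled_dot_product_attention, which dispatches to an efficient fused kernel when one is available. A minimal standalone sketch (tensor shapes are illustrative, not LM-Trainer defaults):

import torch
import torch.nn.functional as F

# Illustrative shapes: (batch, n_heads, seq_len, head_dim)
q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

# Fused attention; is_causal=True applies the causal mask used in language modeling
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 128, 64])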

Documentation

Examples

Check out the examples directory for complete training and inference examples.

Configuration Files

Pre-configured setups for different model sizes are provided in the configs directory.

Contributing

We welcome contributions! Please see our contributing guidelines for details.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

See the contributors list for the individuals and organizations who have made this project possible.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citations

Main Framework

If you use LM-Trainer in your research, please cite:

@software{lm_trainer,
  author = {LM-Trainer Development Team},
  title = {LM-Trainer: A Modular Library for Efficient Language Model Training},
  year = {2025},
  url = {https://github.com/XenArcAI/LM-Trainer.git}
}

XenArcAI Contributions

This project incorporates significant contributions from XenArcAI to the open source community:

@software{xenarc_ai_contributions,
  author = {XenArcAI},
  title = {Open Source Contributions to Language Model Training Frameworks},
  year = {2025},
  url = {https://github.com/XenArcAI},
  note = {Modular architecture design, performance optimizations, and comprehensive documentation}
}
@software{xenarc_lm_trainer_restructure,
  author = {XenArcAI},
  title = {LM-Trainer: Complete Project Restructuring and Modernization},
  year = {2025},
  url = {https://github.com/XenArcAI/LM-Trainer},
  note = {Complete codebase restructuring, modular organization, and performance enhancements}
}
