A modular and efficient library for training language models with a focus on performance and ease of use.
LM-Trainer is a comprehensive framework for training, fine-tuning, and evaluating language models. It provides efficient implementations of state-of-the-art techniques and a modular architecture that makes it easy to customize and extend.
- Modular Architecture: Clean separation of concerns with well-defined modules
- Efficiency Optimizations: Fused attention, gradient compression, and memory optimization
- Multiple Training Strategies: Support for various optimizers, schedulers, and training techniques
- Distributed Training: Built-in support for multi-GPU and distributed training
- PEFT Integration: Parameter-Efficient Fine-Tuning methods like LoRA
- Comprehensive Evaluation: Rich set of metrics and benchmarking tools
- Flexible Configuration: JSON-based configuration system
- CLI Tools: Command-line interfaces for training, generation, and evaluation
```bash
# Clone the repository
git clone https://github.com/XenArcAI/LM-Trainer.git
cd LM-Trainer

# Install in development mode
pip install -e .
```
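A quick way to verify the install is to import the package; note that the package name `lm_trainer` is inferred from the source layout below rather than from official documentation.

```python
# Smoke test: the import should succeed after `pip install -e .`
# (assumes the package imports as `lm_trainer`, matching the src/ layout below)
import lm_trainer

print("lm_trainer imported successfully")
```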
The source tree is organized as follows:

```
src/lm_trainer/
├── cli/            # Command-line interface tools
├── core/           # Core utilities, exceptions, and constants
├── data/           # Data loading and preprocessing
├── extensions/     # Third-party integrations (PEFT, Accelerate, etc.)
├── inference/      # Text generation and model serving
├── models/         # Model architectures and components
├── tokenizers/     # Tokenization utilities
├── training/       # Training loop, optimizers, and evaluation
├── utilities/      # Logging, monitoring, and system utilities
└── __init__.py     # Package initialization
```
The training/ module is further organized as:

```
training/
├── configuration/  # Training and model configuration
├── core/           # Base trainer, callbacks, and checkpointing
├── evaluation/     # Metrics and benchmarking
├── optimization/   # Optimizers, schedulers, and gradient handling
└── __init__.py
```
Train, generate, and evaluate from the command line:

```bash
# Basic training
lm-train --train-data path/to/train/data --output-dir ./output

# Training with configuration file
lm-train --config configs/medium_model.json --model-config configs/model_config.json

# Generate text with a trained model
lm-generate --model-path ./output/model.pt --prompt "Once upon a time"

# Evaluate a trained model
lm-evaluate --model-path ./output/model.pt --eval-data path/to/eval/data
```

LM-Trainer uses JSON configuration files for flexible setup:
```json
{
  "model": {
    "vocab_size": 32000,
    "d_model": 512,
    "n_heads": 8,
    "n_layers": 6,
    "d_ff": 2048,
    "max_seq_len": 1024
  },
  "training": {
    "batch_size": 16,
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
    "num_epochs": 10,
    "lr_scheduler": "cosine",
    "warmup_steps": 1000,
    "optimizer": "adamw",
    "use_amp": true,
    "save_steps": 1000,
    "eval_steps": 500,
    "logging_steps": 100,
    "device": "auto"
  }
}
```
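If you want to consume such a file from Python rather than the CLI, one option is to load it with the standard `json` module and unpack it into the configuration classes used later in this README. Treat this as a sketch: that `ModelConfig` and `TrainingConfig` accept these JSON keys directly as keyword arguments is an assumption, not documented behavior.

```python
import json

from lm_trainer.training.configuration import ModelConfig, TrainingConfig

# Path is illustrative; point it at the JSON file shown above
with open("configs/medium_model.json") as f:
    cfg = json.load(f)

# Assumption: the config classes take the JSON keys as keyword arguments
model_config = ModelConfig(**cfg["model"])
training_config = TrainingConfig(**cfg["training"])
```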
Apply parameter-efficient fine-tuning with LoRA:

```python
from lm_trainer.extensions.peft import LoraConfig, apply_lora_to_model

# Configure LoRA
lora_config = LoraConfig(r=8, alpha=16, dropout=0.1)

# Apply to model
model, lora_adapter = apply_lora_to_model(model, ["q_proj", "v_proj"], lora_config)
```
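A quick, library-agnostic sanity check after applying LoRA is to compare trainable and total parameter counts; this assumes `apply_lora_to_model` freezes the base weights so that only the adapter parameters require gradients.

```python
# Plain PyTorch check (not an LM-Trainer API): count trainable vs. total parameters
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable params: {trainable:,} / {total:,} ({100 * trainable / total:.2f}%)")
```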
Use the Accelerate integration for mixed-precision and distributed training:

```python
from lm_trainer.extensions.accelerate import AcceleratorWrapper

# Initialize accelerator
accelerator = AcceleratorWrapper(mixed_precision="fp16")

# Prepare for distributed training
model, dataloader, optimizer = accelerator.prepare(model, dataloader, optimizer)
```
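A single training step over the prepared objects might then look like the sketch below. Whether `AcceleratorWrapper` forwards Hugging Face Accelerate's `backward()` and how the model computes its loss are assumptions here, so adapt the details to your setup.

```python
# Hypothetical epoch loop over the prepared objects
model.train()
for batch in dataloader:
    optimizer.zero_grad()
    outputs = model(**batch)    # assumes the model returns an object exposing .loss
    loss = outputs.loss
    accelerator.backward(loss)  # assumption: wrapper forwards Accelerate's backward()
    optimizer.step()
```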
You can also drive training directly from Python:

```python
from lm_trainer.training.core import Trainer
from lm_trainer.training.configuration import TrainingConfig, ModelConfig
# Create configurations
model_config = ModelConfig(vocab_size=32000, d_model=512, n_heads=8, n_layers=6)
training_config = TrainingConfig(batch_size=16, learning_rate=3e-5, num_epochs=5)
# Create trainer
trainer = Trainer(
    model=model,
    train_dataloader=train_loader,
    eval_dataloader=eval_loader,
    optimizer=optimizer,
    config=training_config,
    model_config=model_config
)
# Start training
results = trainer.train()
```
LM-Trainer includes several performance optimizations:

- Fused Attention: Uses PyTorch's `scaled_dot_product_attention` for efficient attention computation (see the sketch after this list)
- Mixed Precision: Automatic mixed precision training support
- Gradient Compression: For efficient distributed training
- Memory Optimization: Gradient checkpointing and efficient memory management
- Batch Size Tuning: Automatic batch size adjustment based on available memory
- Model Compilation: PyTorch 2.0+ compilation for faster execution
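To illustrate the fused attention and compilation items above, the snippet below uses plain PyTorch APIs (not LM-Trainer internals); the shapes and the toy module are arbitrary.

```python
import torch
import torch.nn.functional as F

# Fused attention: one kernel replaces the manual softmax(QK^T / sqrt(d)) @ V sequence.
# Shapes are (batch, n_heads, seq_len, head_dim); is_causal=True applies a causal mask.
q, k, v = (torch.randn(2, 8, 128, 64) for _ in range(3))
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Model compilation (PyTorch 2.0+): wrap a module for faster execution.
toy_model = torch.nn.Linear(64, 64)
compiled_model = torch.compile(toy_model)
```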
- API Reference - Complete API documentation
- Configuration Guide - Detailed configuration instructions
- Training Guide - Comprehensive training tutorial
- Inference Guide - Text generation and model serving
- Performance Guide - Optimization techniques and best practices
Check out the examples directory for complete training and inference examples:
- Complete Pipeline - End-to-end training and generation
- Small Model Training - Quick start example
- Text Generation - Text generation with different methods
- Advanced Training - Advanced features and optimizations
Pre-configured setups for different model sizes:
- Small Model - Quick experiments and testing
- Medium Model - Balanced performance and resource usage
- Large Model - High-performance training
- CPU Optimized - CPU-friendly configurations
We welcome contributions! Please see our contributing guidelines for details.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
See our contributors list for information about the talented individuals and organizations who have made this project possible.
This project is licensed under the MIT License - see the LICENSE file for details.
If you use LM-Trainer in your research, please cite:
```bibtex
@software{lm_trainer,
  author = {LM-Trainer Development Team},
  title = {LM-Trainer: A Modular Library for Efficient Language Model Training},
  year = {2025},
  url = {https://github.com/XenArcAI/LM-Trainer.git}
}
```

This project incorporates significant contributions from XenArcAI to the open source community:
```bibtex
@software{xenarc_ai_contributions,
  author = {XenArcAI},
  title = {Open Source Contributions to Language Model Training Frameworks},
  year = {2025},
  url = {https://github.com/XenArcAI},
  note = {Modular architecture design, performance optimizations, and comprehensive documentation}
}

@software{xenarc_lm_trainer_restructure,
  author = {XenArcAI},
  title = {LM-Trainer: Complete Project Restructuring and Modernization},
  year = {2025},
  url = {https://github.com/XenArcAI/LM-Trainer},
  note = {Complete codebase restructuring, modular organization, and performance enhancements}
}
```