EvoLoss is an open-source Python framework that discovers new loss functions using genetic programming. The goal is to automatically evolve symbolic loss expressions that can outperform standard losses (e.g., Cross-Entropy, MSE) in convergence speed and final accuracy.
- Primary Goal: Discover novel, meaningful loss functions that achieve at least 2% better accuracy or 25% faster convergence than standard losses (MSE/CrossEntropy) on common machine learning tasks.
- Convergence Speed: Target 1.1-1.3x faster convergence on MNIST and similar datasets.
- Accuracy: Maintain similar or better accuracy (±1%) compared to standard losses.
- Gradient Stability: Ensure meaningful gradient profiles without explosions or vanishing gradients.
- Best score exceeding baseline performance
- Faster convergence (fewer epochs to reach target accuracy)
- Stable gradients within reasonable bounds
- Formula simplicity and interpretability
- Cross-domain robustness across multiple datasets
- Evolves symbolic expression trees and compiles them into PyTorch-compatible loss functions.
- Supports two evaluation strategies: full training and fast proxy (derivative-based) evaluation.
- Saves results (metrics, plots, reports, checkpoints) under `results/`.
- Symbolic loss trees with terminals (`y_pred`, `y_true`, `epsilon`, `one`) and a rich set of operators, including:
  - Arithmetic: `+`, `-`, `*`, `/`
  - Activations: `sigmoid`, `relu`, `tanh`
  - Comparisons: `<`, `>`, `==`
  - Unary operators: `sin`, `abs`, `log`, `sqrt` (with safety mechanisms)
  - Ternary: `clip`
- Evaluation strategies:
  - `full`: trains a model for several epochs and aggregates a multi-objective fitness.
  - `proxy`: fast gradient-based proxy without training.
- Configurable dataset and model:
  - `dataset.loader`: `module:function` returning `(train_loader, val_loader)`.
  - `model.module`: `module:function` returning a `torch.nn.Module`.
- Parallel evaluation with per-process logging and generation checkpoints.
- HTML report and plots for the best candidate: loss curve, derivative curve, and expression tree visualization.
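To make the feature list above concrete, here is a minimal sketch of how a symbolic loss tree could be compiled into a callable. It is an illustration only: plain Python floats stand in for PyTorch tensors, nested tuples stand in for the tree type, and the guarded `log` stands in for the safety mechanisms; the actual EvoLoss internals may differ.

```python
import math

# Hypothetical operator table; the guarded log illustrates the
# "safety mechanisms" mentioned for unary operators.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "abs": lambda a: abs(a),
    "log": lambda a: math.log(abs(a) + 1e-12),
}

def compile_tree(node):
    """Recursively turn a nested (op, *children) tuple into loss(y_pred, y_true)."""
    if node == "y_pred":
        return lambda y_pred, y_true: y_pred
    if node == "y_true":
        return lambda y_pred, y_true: y_true
    if isinstance(node, (int, float)):
        return lambda y_pred, y_true, c=node: c  # constant leaf
    op, *children = node
    fns = [compile_tree(c) for c in children]
    return lambda y_pred, y_true: OPS[op](*[f(y_pred, y_true) for f in fns])

# (y_pred - y_true) * (y_pred - y_true), i.e. a squared error
diff = ("-", "y_pred", "y_true")
loss = compile_tree(("*", diff, diff))
print(loss(0.8, 1.0))  # → 0.04 (up to float rounding)
```

The same recursive-compilation idea carries over directly to tensors: each leaf returns a tensor and each operator node applies a differentiable PyTorch op, so the compiled loss remains autograd-friendly.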
- Install dependencies: `pip install -r requirements.txt`
- (Optional) Install pinned dependencies from the lockfile for reproducibility: `pip install -r requirements.lock`
- Run evolution (example config for MNIST SimpleCNN): `python main.py --config configs/mnist_cnn_config.yaml`
- Use reduced datasets and capped batches for fast checks:
  - FashionMNIST: `python main.py --config configs/fashion_quick.yaml`
  - CIFAR10 (grayscale 28x28): `python main.py --config configs/cifar10_quick.yaml`
Key config options for speed:

```yaml
dataset:
  type: custom
  loader: evoloss.data_loaders:load_fashion_mnist  # or :load_cifar10_gray28
  samples: 3000  # limit dataset size for quick runs
# caps applied in training/evaluation loops
max_train_batches: 80
max_val_batches: 40
```

- Switch evaluation mode:

  ```yaml
  evaluation:
    mode: full  # or "proxy"
  ```
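For intuition about the `proxy` mode, a derivative-based proxy might probe a candidate loss on a grid of predictions and check that its gradient points toward the target and stays bounded, with no training at all. This is a hypothetical sketch; the function name and scoring rule are illustrative, not the EvoLoss implementation.

```python
def proxy_score(loss_fn, y_true=1.0, n=50, h=1e-5):
    """Score a loss by gradient direction/magnitude on probe points, no training."""
    good, grads = 0, []
    for i in range(1, n):
        y_pred = i / n  # probes in (0, 1), all below y_true
        # central-difference estimate of d(loss)/d(y_pred)
        g = (loss_fn(y_pred + h, y_true) - loss_fn(y_pred - h, y_true)) / (2 * h)
        grads.append(abs(g))
        if g < 0:  # loss should decrease as y_pred approaches y_true
            good += 1
    if max(grads) > 1e3:  # crude gradient-explosion check
        return 0.0
    return good / (n - 1)

mse = lambda p, t: (p - t) ** 2
print(proxy_score(mse))  # → 1.0: every probe has a well-behaved descent direction
```

Because it needs only forward evaluations of the candidate formula, such a proxy is orders of magnitude cheaper than `full` evaluation and is a natural first filter before real training.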
- Custom dataset and model:

  ```yaml
  dataset:
    type: custom
    loader: mypkg.data:load_data        # must return (train_loader, val_loader)
  model:
    module: mypkg.models:create_model   # must return torch.nn.Module
  ```
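As a sketch, hypothetical `mypkg.data:load_data` and `mypkg.models:create_model` implementations satisfying those contracts might look like this (synthetic tensors are used so the example stands alone; the names mirror the config snippet above and are not part of EvoLoss itself):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

def load_data(batch_size=64):
    """Loader contract: must return (train_loader, val_loader)."""
    x = torch.randn(256, 1, 28, 28)   # synthetic 28x28 grayscale images
    y = torch.randint(0, 10, (256,))  # synthetic class labels
    train, val = random_split(TensorDataset(x, y), [192, 64])
    return (DataLoader(train, batch_size=batch_size, shuffle=True),
            DataLoader(val, batch_size=batch_size))

def create_model():
    """Model contract: must return a torch.nn.Module."""
    return torch.nn.Sequential(torch.nn.Flatten(),
                               torch.nn.Linear(28 * 28, 10))
```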
- Training and optimizer:

  ```yaml
  training:
    epochs: 5
    optimizer: Adam
    lr: 0.001
    weight_decay: 0.0001
  ```
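A training section like this might map onto `torch.optim` roughly as follows; `build_optimizer` is an illustrative helper, not an EvoLoss API:

```python
import torch

def build_optimizer(model, training_cfg):
    """Look up the optimizer class by name and apply lr / weight_decay."""
    opt_cls = getattr(torch.optim, training_cfg["optimizer"])  # e.g. "Adam"
    return opt_cls(model.parameters(),
                   lr=training_cfg["lr"],
                   weight_decay=training_cfg.get("weight_decay", 0.0))

cfg = {"optimizer": "Adam", "lr": 0.001, "weight_decay": 0.0001}
opt = build_optimizer(torch.nn.Linear(4, 2), cfg)
print(type(opt).__name__)  # → Adam
```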
- Multi-objective weights:

  ```yaml
  weights:
    accuracy: 3.0
    complexity: 0.01
    gradient: 0.1
    speed: 0.02
  ```
- `results/run_log.log` and per-process logs when parallel evaluation is enabled.
- `results/stats.csv` with per-generation metrics.
- `results/best_functions.txt` and `results/checkpoints/`.
- `results/plots/gen_{N}/best_loss.png`, `best_dloss.png`, `best_tree.png`.
- `results/plots/gen_{N}_best_simplified.txt`: symbolic simplification of the best formula (SymPy).
- `results/report.html`: an English HTML summary report generated from artifacts.
Generate the report: `python scripts/export_report.py --results results`

- The repository includes a GitHub Actions workflow (`.github/workflows/ci.yml`) running `pytest` on Python 3.10.
- Run tests locally: `pytest -q`
- Package version is exported as `evoloss.__version__`. Current: `0.1.1`.
- Create a new repository on GitHub.
- Initialize and push locally:

  ```bash
  git init
  git add .
  git commit -m "chore: initial public release"
  git branch -M main
  git remote add origin https://github.com/<your-org>/<your-repo>.git
  git push -u origin main
  ```
- Verify CI passed on GitHub.
```text
evoloss-project/
├── configs/
│   ├── mnist_cnn_config.yaml
│   ├── quick_smoke.yaml
│   └── synth_medium.yaml
├── evoloss/
│   ├── __init__.py
│   ├── symbolic_tree.py
│   ├── evaluation.py
│   ├── evolution.py
│   └── utils.py
├── results/
│   ├── best_functions.txt
│   ├── report.html
│   └── run_log.log
├── scripts/
│   ├── export_report.py
│   └── smoke_test.py
├── tests/
│   ├── test_analytics.py
│   └── test_evolution_ops.py
├── main.py
├── requirements.txt
├── requirements.lock
└── README.md
```
- Use the lockfile to pin exact versions: `pip freeze > requirements.lock`
- Install from the lockfile: `pip install -r requirements.lock`
- Fast proxy evaluation without training:

  ```yaml
  evaluation:
    mode: proxy  # or "full"
  ```
- `population_size`, `tournament_size`, `elitism`
- `max_tree_depth`, `mutation_rate`, `crossover_rate`
- `generations` (increased to 6 for better exploration)
- `checkpoint_interval`, `parallel_eval`
- `similarity_threshold`, `diversity_weight`
- Fitness weights: `w1..w4`, `grad_threshold`
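How `tournament_size` drives selection can be sketched as follows, assuming individuals are represented as hypothetical `(formula, fitness)` pairs:

```python
import random

def tournament_select(population, tournament_size=3, rng=random):
    """Pick `tournament_size` random contenders; the fittest one wins."""
    contenders = rng.sample(population, tournament_size)
    return max(contenders, key=lambda ind: ind[1])

population = [("mse-like", 0.50), ("huber-like", 0.70),
              ("evolved-1", 0.90), ("evolved-2", 0.60)]
parent = tournament_select(population, tournament_size=3)
```

Larger tournaments raise selection pressure: with `tournament_size` equal to the population size the best individual always wins, while a size of 2 gives weaker candidates a realistic chance of being selected.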
The project includes a cross-domain robustness stage to test evolved formulas across various datasets:
- MNIST (classification)
- FashionMNIST (classification)
- CIFAR10 (image classification)
- Regression datasets
A formula is considered robust when it maintains performance advantages across multiple domains.
The multi-objective fitness function balances several key metrics:
- Accuracy: Primary performance metric (highest weight)
- Gradient Smoothness: Ensures stable training without exploding/vanishing gradients
- Complexity: Penalizes overly complex formulas to favor interpretable solutions
- Speed: Rewards faster convergence during training
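One plausible way these objectives combine into a single scalar, reusing the weight names from the config shown earlier; the sign conventions and metric names here are illustrative, not the exact EvoLoss formula:

```python
def fitness(metrics, weights):
    """Weighted sum: reward accuracy, gradient quality, and speed; penalize size."""
    return (weights["accuracy"]   * metrics["accuracy"]
            - weights["complexity"] * metrics["tree_nodes"]
            + weights["gradient"]   * metrics["grad_smoothness"]
            + weights["speed"]      * metrics["convergence_speed"])

weights = {"accuracy": 3.0, "complexity": 0.01, "gradient": 0.1, "speed": 0.02}
metrics = {"accuracy": 0.97, "tree_nodes": 12,
           "grad_smoothness": 0.8, "convergence_speed": 1.2}
print(round(fitness(metrics, weights), 3))  # → 2.894
```

The dominant `accuracy` weight matches the stated priority order, while the small `complexity` weight nudges evolution toward compact, interpretable formulas without vetoing accurate ones.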
- Parallel evaluation (`parallel_eval: true`) uses available CPU cores.
- Diversity is encouraged via formula similarity (Jaccard) and penalties.
- SymPy simplifications and visualizations are saved for the best individual in each generation.
- The expanded operator set (including `sin`, `abs`, `log`, `sqrt`) increases the search space for potentially better loss functions.
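The Jaccard similarity mentioned above can be sketched over the token sets of two formulas (a hypothetical tokenization; the real implementation may compare expression trees differently):

```python
def jaccard(tokens_a, tokens_b):
    """Similarity = |intersection| / |union| of the formulas' token sets."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

f1 = ["*", "-", "y_pred", "y_true"]    # tokens of a squared-error-style formula
f2 = ["*", "log", "y_pred", "y_true"]  # a log-based variant
print(jaccard(f1, f2))  # → 0.6 (3 shared tokens out of 5 distinct)
```

Candidates whose similarity to existing individuals exceeds `similarity_threshold` can then be penalized in proportion to `diversity_weight`, keeping the population from collapsing onto one formula family.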