
5-Dollar LLM (Blueberry 88M)

Help us build a top-10 LLM in the world while keeping it fully open source, accelerating everyone and everything that uses LLMs (science, technology, medicine, startups, businesses, etc.).

Check out our contributors leaderboard!

🗺️ Open Superintelligence Lab Roadmap

Our goals:

  1. GPT-1 Level by Dec 20 2025 ✓ Watch
  2. GPT-2 Level by Jan 20 2026
  3. GPT-3 Level by Apr 20 2026
  4. Top 150 in LMArena (GPT-4o-mini level) by June 2026
  5. Top 50 by Apr 2027
  6. Top 10 by Dec 2027
  7. Possibly Top 1 by 2028 (TBD)

Can you make our LLM train faster and better?

👉 Full Setup Guide | Leaderboard | Multimodal Guide


🎨 Multimodal Image Generation (Hard Mode)

We have successfully implemented "Hard Mode" multimodal image generation: a mini version of Google Parti or DeepSeek Janus, built from scratch with zero pre-trained weights.

How it works:

  1. Visual Tokenizer: A custom VQ-VAE compresses 128x128 images into a 32x32 grid of discrete "visual words".
  2. Multimodal Transformer: A 40M-parameter Llama-style transformer trained to predict both text and visual tokens in a single unified stream.
  3. Unified Vocabulary: Text tokens (49k) and image tokens (1k) share one vocabulary and are interleaved as [BOS] {text} <seg_start> {visual_tokens} <seg_end> [EOS] (see the sequence sketch after this list).
  4. Optimized Training: Powered by the Muon optimizer and bfloat16 mixed precision, allowing high-quality image synthesis on a single GPU (see the training-step sketch below).
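
To make the unified vocabulary concrete, here is a minimal sketch of how an interleaved training sequence could be assembled. The vocabulary layout, special-token IDs, `text_tokenizer`, and `vq_encoder` are illustrative assumptions, not the repo's exact API.

```python
import torch

# Hypothetical shared-vocabulary layout (illustration only):
#   0 .. 48_999       -> text tokens (49k)
#   49_000 .. 49_999  -> visual tokens (1k VQ-VAE codebook entries, offset past text)
#   50_000+           -> special tokens
TEXT_VOCAB = 49_000
BOS, EOS, SEG_START, SEG_END = 50_000, 50_001, 50_002, 50_003

def build_sequence(caption: str, image: torch.Tensor,
                   text_tokenizer, vq_encoder) -> torch.Tensor:
    """Interleave text and visual tokens into one training sequence:
    [BOS] {text} <seg_start> {visual_tokens} <seg_end> [EOS]."""
    text_ids = text_tokenizer.encode(caption)            # list[int], all < TEXT_VOCAB
    # The VQ-VAE encoder maps a 128x128 image to a 32x32 grid of codebook indices.
    grid = vq_encoder(image.unsqueeze(0))                 # (1, 32, 32) long tensor
    visual_ids = (grid.flatten() + TEXT_VOCAB).tolist()   # shift into the shared vocab
    return torch.tensor([BOS, *text_ids, SEG_START, *visual_ids, SEG_END, EOS])
```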
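
The training step itself can be sketched roughly as below. Muon is the repo's optimizer choice; since its exact interface is not shown here, `optimizer` is a placeholder, and the example only demonstrates a standard bfloat16 autocast step (which, unlike float16, needs no gradient scaler).

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, batch, device="cuda"):
    """One bfloat16 mixed-precision step over a batch of interleaved sequences.
    `batch` is a (B, T) tensor of token ids; the model predicts the next token
    at every position, so text and visual tokens share a single loss."""
    tokens = batch.to(device)
    inputs, targets = tokens[:, :-1], tokens[:, 1:]

    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        logits = model(inputs)                            # (B, T-1, vocab)
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               targets.reshape(-1))
    loss.backward()
    optimizer.step()
    return loss.item()
```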

Achievement:

The model has been scaled to 1,000,000 training sequences on CIFAR-10, demonstrating the ability to generate class-specific images (frogs, birds, cars, etc.) from scratch in an autoregressive fashion.
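
For a rough idea of how class-conditional generation works, the sketch below prompts the transformer with a class label, autoregressively samples 32x32 = 1024 visual tokens, and decodes them with the VQ-VAE decoder. All names (`model`, `vq_decoder`, the constants reused from the sequence sketch above) are hypothetical, not the repo's exact API.

```python
import torch

TEXT_VOCAB = 49_000                 # hypothetical shared-vocab layout (see sketch above)
BOS, SEG_START = 50_000, 50_002
NUM_VISUAL_CODES = 1_000            # VQ-VAE codebook size

@torch.no_grad()
def generate_image(model, vq_decoder, text_tokenizer, label: str,
                   temperature: float = 1.0, device: str = "cuda"):
    """Sample one 32x32 grid of visual tokens for `label` (e.g. "frog"),
    then decode it back to pixels with the VQ-VAE decoder."""
    prompt = torch.tensor([[BOS, *text_tokenizer.encode(label), SEG_START]],
                          device=device)
    seq = prompt
    for _ in range(32 * 32):                       # one visual token per grid cell
        logits = model(seq)[:, -1, :]              # logits for the next position
        # restrict sampling to the visual-token slice of the shared vocabulary
        visual = logits[:, TEXT_VOCAB:TEXT_VOCAB + NUM_VISUAL_CODES] / temperature
        next_code = torch.multinomial(visual.softmax(-1), num_samples=1)
        seq = torch.cat([seq, next_code + TEXT_VOCAB], dim=1)
    codes = (seq[:, prompt.size(1):] - TEXT_VOCAB).view(1, 32, 32)
    return vq_decoder(codes)                       # e.g. a (1, 3, 128, 128) image tensor
```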


Acceptance criteria:

  1. Once you measure an improvement over the baseline according to the Setup Guide, submit your code in a GitHub pull request.
  2. The LLM must train faster or achieve lower loss on any of the benchmarks (8M, 20M, 100M, 1B tokens).
  3. Lower loss takes priority over training speed because pretraining data is limited: if your submission trains more slowly but achieves better (lower) loss for the same number of tokens, it will probably be accepted, and vice versa.
  4. Add as little code as possible, keep it clean, and rewrite AI-generated pull request descriptions to improve their quality.
  5. Submissions are judged case by case; tradeoffs between speed, loss, etc. will be taken into account.

🤝 Partners & Support

If you want to write a research paper improving this project, or if you or someone you know has extensive research experience and wants to contribute to this open-source initiative, contact me.

We will partner with compute providers while keeping all research/engineering/code fully open source.

Potential partners include: Hugging Face, NVIDIA, Microsoft, Google, Amazon, Meta, IBM, Oracle, Alibaba, Tencent, Huawei, Baidu, CoreWeave, Lambda Labs, Hyperbolic, Stability AI, OpenAI, Anthropic, xAI, Cohere, Mistral AI, Graphcore, Tenstorrent, Intel, AMD, Dell Technologies, ai2, a16z, Sequoia Capital, and more.
