Our goals:
- Top 150 in LMArena (GPT-4o-mini level) by April 2026
- Top 50 by Dec 2026
- Top 10 by April 2027
- Potentially Top 1 by 2028 (TBD)
Likely architecture for our first LLM (Top 150, April 2026):
- 8 billion parameters
- 15 trillion training tokens
This requires roughly 300,000 H100 hours or equivalent.
We will partner with one or more organizations for this compute while keeping all research, engineering, and code FULLY open source (and publishing daily videos on everything we do), in the spirit of open science that benefits everyone.
Potential partners include:
Hugging Face, NVIDIA, Microsoft, Google, Amazon, Meta, IBM, Oracle, Alibaba, Tencent, Huawei, Baidu, CoreWeave, Lambda Labs, Hyperbolic, Stability AI, OpenAI, Anthropic, xAI, Cohere, Mistral AI, Graphcore, Tenstorrent, Intel, AMD, Dell Technologies, ai2, a16z, Sequoia Capital, and more.
As a community, we will find ways to get the compute.
Currently, LLMs are the most useful AI models, so they are a clear way for us to do useful research. As we gain more experience, we will expand toward more speculative research that could lead to better AI models.
If you or someone you know has extensive research experience and can offer advisory or leadership support, please contact me.
