A comprehensive, step-by-step journey into building Large Language Models from the ground up
Author: Solomon Eshun
This repository contains the complete source code, explanations, and visualizations for the "Building LLMs from Scratch" series. Whether you're a beginner curious about how ChatGPT works or an experienced developer wanting to understand transformer architecture deeply, this series will guide you through every component step by step.
This educational series breaks down the complexity of Large Language Models into digestible, hands-on tutorials. Each part builds upon the previous one, gradually constructing a complete transformer-based language model from scratch using PyTorch.
🎯 Learning Objectives:
- Understand the fundamental architecture of transformer models
- Implement each component (tokenization, embeddings, attention, etc.) from scratch
- Gain practical experience with PyTorch and deep learning concepts
- Learn best practices for training and evaluating language models
- Explore modern techniques used in state-of-the-art LLMs
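To give a flavor of the hands-on objectives above, here is a minimal sketch of one such component: token embeddings combined with learned positional embeddings in PyTorch. The dimensions and variable names are illustrative assumptions for this README, not the values the series settles on.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; the series derives its own configuration step by step.
vocab_size, context_len, emb_dim = 50_257, 8, 256

token_emb = nn.Embedding(vocab_size, emb_dim)   # token ID -> dense vector
pos_emb = nn.Embedding(context_len, emb_dim)    # one learned vector per position

token_ids = torch.randint(0, vocab_size, (4, context_len))     # dummy batch of 4 sequences
x = token_emb(token_ids) + pos_emb(torch.arange(context_len))  # broadcast add -> (4, 8, 256)
print(x.shape)  # torch.Size([4, 8, 256])
```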
👥 Target Audience:
- Students and researchers in AI/ML
- Software engineers interested in NLP
- Anyone curious about how LLMs actually work
- Developers wanting to build custom language models
| Part | Topic | Status | Article | Code |
|---|---|---|---|---|
| 01 | The Complete Theoretical Foundation | ✅ Complete | Medium | N/A |
| 02 | Tokenization | ✅ Complete | Medium | Code |
| 03 | Data Pipeline (Input-Target Pairs) | ✅ Complete | Medium | Code |
| 04 | Token Embeddings & Positional Encoding | ✅ Complete | Medium | Code |
| 05 | Complete Data Preprocessing Pipeline | ✅ Complete | Medium | Code |
| 06 | The Attention Mechanism | ✅ Complete | Medium | Code |
| 07 | Self-Attention with Trainable Weights | ✅ Complete | Medium | Code |
| 08 | Causal Attention | ✅ Complete | Medium | Code |
| 09 | Multi-Head Attention | 🔄 In Progress | Medium | Code |
| 10 | Transformer Blocks & Architecture | ⏳ Planned | Medium | Code |
| 11 | Training Loop & Optimization | ⏳ Planned | Medium | Code |
| 12 | Model Evaluation & Fine-tuning | ⏳ Planned | Medium | Code |
Legend: ✅ Complete | 🔄 In Progress | ⏳ Planned
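To preview where Parts 06 through 09 are headed, below is a minimal sketch of single-head causal self-attention in PyTorch. It is illustrative only: the class name, dimensions, and overall structure are assumptions made for this README, not the series' final implementation (which builds up to multi-head attention).

```python
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    """Minimal single-head causal self-attention (illustrative sketch only)."""

    def __init__(self, d_in, d_out, context_len):
        super().__init__()
        self.W_q = nn.Linear(d_in, d_out, bias=False)
        self.W_k = nn.Linear(d_in, d_out, bias=False)
        self.W_v = nn.Linear(d_in, d_out, bias=False)
        # Upper-triangular mask hides "future" positions from each token.
        self.register_buffer(
            "mask", torch.triu(torch.ones(context_len, context_len), diagonal=1).bool()
        )

    def forward(self, x):                       # x: (batch, seq_len, d_in)
        seq_len = x.shape[1]
        q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
        scores = q @ k.transpose(1, 2) / k.shape[-1] ** 0.5   # scaled dot-product scores
        scores = scores.masked_fill(self.mask[:seq_len, :seq_len], float("-inf"))
        weights = torch.softmax(scores, dim=-1)               # attention weights
        return weights @ v                      # (batch, seq_len, d_out)

x = torch.randn(2, 6, 16)                       # dummy batch: 2 sequences of 6 tokens
print(CausalSelfAttention(16, 16, context_len=8)(x).shape)  # torch.Size([2, 6, 16])
```

The upper-triangular mask is what makes the attention causal: each token can only attend to itself and to earlier positions in the sequence.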
- Python 3.8 or higher
- Basic understanding of Python and neural networks
- Familiarity with PyTorch (helpful but not required)
1. Clone the repository:

   ```bash
   git clone https://github.com/soloeinsteinmit/llm-from-scratch.git
   cd llm-from-scratch
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Run the first code example (from Part 2):

   ```bash
   python src/part02_tokenization.py
   ```
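The command above runs the Part 02 script. As a quick preview of what that part covers, here is a toy word-level tokenizer; the regular expression, class name, and vocabulary handling are illustrative assumptions, not the repository's actual code.

```python
import re

class SimpleTokenizer:
    """Toy word-level tokenizer for illustration; not the series' implementation."""

    def __init__(self, text):
        tokens = re.findall(r"\w+|[^\w\s]", text)              # words and punctuation
        vocab = sorted(set(tokens))
        self.str_to_id = {tok: i for i, tok in enumerate(vocab)}
        self.id_to_str = {i: tok for tok, i in self.str_to_id.items()}

    def encode(self, text):
        return [self.str_to_id[tok] for tok in re.findall(r"\w+|[^\w\s]", text)]

    def decode(self, ids):
        return " ".join(self.id_to_str[i] for i in ids)

tok = SimpleTokenizer("Hello, world. Hello again!")
print(tok.encode("Hello, world."))                 # [3, 1, 5, 2] given this tiny vocabulary
print(tok.decode(tok.encode("Hello, world.")))     # Hello , world .
```

Production LLMs typically use subword tokenizers (for example, byte-pair encoding) rather than whole words, but the encode/decode round trip shown here is the same basic idea.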
```
llm-from-scratch/
├── README.md                                # You are here!
├── requirements.txt                         # Python dependencies
├── LICENSE                                  # MIT License
│
├── notebooks/                               # Jupyter notebooks for interactive learning
│   ├── part02_tokenization.ipynb
│   └── ...
│
├── animations/                              # Manim visualizations and diagrams
│   └── part-02-WordTokenizationScene.mp4    # Generated animation files
│
└── src/                                     # Source code for each part
    ├── part02_tokenization.py
    └── utils/                               # Helper functions and utilities
```
- Start with Part 01 on Medium for the theoretical foundation.
- Follow Part 02 and subsequent parts for hands-on coding.
- Run the code to see practical implementation.
- Experiment with the parameters and try modifications.
- Check the notebooks for interactive exploration.
For educators:
- Use the code examples in your courses
- Reference the visualizations for explanations
- Adapt the materials for your curriculum
- Contribute improvements and additional examples
For developers and researchers:
- Use as a foundation for your own model implementations
- Reference the clean, well-documented code structure
- Build upon the base architecture for your experiments
This series includes custom Manim animations that visualize complex concepts:
- 🔄 Attention mechanisms - See how tokens "attend" to each other
- 📊 Data flow - Understand how information moves through the model
- 🧮 Matrix operations - Visualize the math behind transformers
- 📈 Training dynamics - Watch the model learn in real-time
Animations are generated with Manim and are available in the `animations/` directory.
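If you want to render or extend the visuals yourself, the sketch below shows the general shape of a Manim scene and the render command, assuming Manim Community Edition is installed. The scene class and text are placeholders, not one of the repository's actual animations.

```python
from manim import Scene, Text, Write  # Manim Community Edition

class TokenizationTeaser(Scene):
    """Placeholder scene for illustration; not one of the repository's real scenes."""

    def construct(self):
        title = Text("Tokenization: text -> token IDs")
        self.play(Write(title))  # animate the text being drawn on screen
        self.wait(1)

# Render a low-quality preview from the command line:
#   manim -pql this_file.py TokenizationTeaser
```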
We welcome contributions from the community! This is an open-source educational project aimed at making LLM understanding accessible to everyone.
Ways to contribute:
- 🐛 Report bugs or suggest improvements
- 📝 Improve documentation and explanations
- 🎨 Create additional visualizations
- 🔧 Add new features or optimizations
- 🌍 Translate content to other languages
- The Illustrated Transformer by Jay Alammar
- Attention Is All You Need - Original Transformer Paper
- GPT-3 Paper - Language Models are Few-Shot Learners
This project is licensed under the MIT License - see the LICENSE file for details.
- Community: Thanks to all contributors and learners who make this project better
- Inspiration: Built upon the excellent work of researchers and educators in the field
- Tools: Created with PyTorch, Manim, and lots of coffee ☕
- 📝 Medium: Follow the series on Medium
- 💼 LinkedIn: Connect and discuss on LinkedIn
- 🐙 GitHub: Star this repo and follow for updates
⭐ If you find this helpful, please give it a star! It helps others discover this resource.
Built with ❤️ for the open-source community