TinyMamba is a minimal, educational implementation of a hybrid language model architecture combining state-space modeling (SSM) and transformer attention. It is designed for clarity, extensibility, and experimentation with memory-based reasoning and context-dependent computation.
The model mixes ideas from the Mamba selective state-space model with standard Transformer blocks. It includes:
- A state-space token mixer for sequential processing and implicit positional encoding
- An optional Transformer attention layer for global context
- Persistent and decaying internal states that carry memory across generations
- Context-dependent gating, allowing the model to modulate how new information updates its internal state (a minimal sketch of such a block follows this list)
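The sketch below shows one way these pieces could be wired together. All class and layer names are illustrative assumptions rather than TinyMamba's actual API, and the selective SSM mixer is stood in for by a causal depthwise convolution to keep the example short:

```python
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    """Illustrative hybrid block: an SSM-style token mixer, optional
    self-attention, and a context-dependent gate on the residual update.
    Names and shapes are assumptions, not TinyMamba's real code."""

    def __init__(self, d_model: int, use_attention: bool = True):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        # Stand-in for the selective SSM mixer: a depthwise causal conv
        # gives sequential, position-aware mixing without explicit
        # positional embeddings.
        self.mixer = nn.Conv1d(d_model, d_model, kernel_size=4,
                               padding=3, groups=d_model)
        self.use_attention = use_attention
        if use_attention:
            self.norm2 = nn.LayerNorm(d_model)
            # d_model must be divisible by num_heads
            self.attn = nn.MultiheadAttention(d_model, num_heads=4,
                                              batch_first=True)
        # Context-dependent gate: decides how much of the new mix to admit.
        self.gate = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        h = self.norm1(x)
        mixed = self.mixer(h.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        x = x + torch.sigmoid(self.gate(h)) * mixed
        if self.use_attention:
            h = self.norm2(x)
            attn_out, _ = self.attn(h, h, h, need_weights=False)
            x = x + attn_out
        return x
```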
The model can save and reload its internal state between sessions, effectively allowing it to preserve long-term context or gradually forget old information through a decay mechanism.
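As a rough illustration of that mechanism (the function names, decay factor, and `./state/session.pt` path are assumptions, not TinyMamba's actual interface), the state can be decayed, written to disk, and restored on the next run:

```python
import os
import torch

def decay_state(state: torch.Tensor, decay_rate: float = 0.95) -> torch.Tensor:
    # Exponentially shrink the stored state so old context fades over time.
    return state * decay_rate

def save_state(state: torch.Tensor, path: str = "./state/session.pt") -> None:
    os.makedirs(os.path.dirname(path), exist_ok=True)
    torch.save(state, path)

def load_state(path: str = "./state/session.pt") -> torch.Tensor:
    return torch.load(path)

# End of a session: forget a little, then persist.
# state = decay_state(state); save_state(state)
# Start of the next session:
# state = load_state()
```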
- Simple modular code structure using PyTorch
- Hybrid Mamba–Transformer block design
- Configurable decay rate for temporal memory
- Optional goal conditioning via contextual gating
- Training-ready setup with the AdamW optimizer (see the sketch after this list)
- Persistent memory saving to the `./state/` directory
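What a training-ready setup might look like is sketched below; the config fields, default values, and the `build_model` constructor are illustrative assumptions rather than TinyMamba's real interface:

```python
import torch
import torch.nn as nn
from dataclasses import dataclass

@dataclass
class TinyConfig:
    # Illustrative hyperparameters; names and defaults are assumptions.
    d_model: int = 256
    n_layers: int = 4
    vocab_size: int = 256
    decay_rate: float = 0.95     # how quickly persistent memory fades
    lr: float = 3e-4
    weight_decay: float = 0.01

def train_step(model: nn.Module, optimizer: torch.optim.Optimizer,
               tokens: torch.Tensor) -> float:
    """One next-token prediction step: inputs tokens[:, :-1], targets tokens[:, 1:]."""
    logits = model(tokens[:, :-1])                       # (B, T-1, vocab)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# cfg = TinyConfig()
# model = build_model(cfg)   # hypothetical constructor
# optimizer = torch.optim.AdamW(model.parameters(), lr=cfg.lr,
#                               weight_decay=cfg.weight_decay)
```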
To try it out:

```bash
uv run tiny_mamba.py   # You should see 'Tiny Mamba!' and the REPL will start
```

TinyMamba is not meant as a production model. It serves as a learning scaffold for exploring:
- State-space architectures for sequence modeling
- Hybridization of recurrent and attention-based computation
- Persistent and goal-conditioned memory mechanisms in language models
TinyMamba is part of an ongoing exploration into Goal-Conditioned State Space Reasoners (GSSR), a theoretical architecture that maintains evolving internal states influenced by both recent inputs and explicit goals. The aim is to understand how stateful architectures can extend the reasoning horizon of language models without relying solely on external memory or windowed context.
By combining differentiable memory, context-dependent gating, and hybrid attention mechanisms, TinyMamba provides a conceptual foundation for studying how long-term, self-updating state representations can be used to guide generative reasoning.
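As a conceptual sketch of that idea (all names and shapes here are assumptions, not code from the TinyMamba repository), goal conditioning can be pictured as a gate that blends a proposed update into the persistent state based on both the current input and a goal embedding:

```python
import torch
import torch.nn as nn

class GoalConditionedGate(nn.Module):
    """Conceptual sketch: the current input and a goal embedding jointly
    decide how strongly the persistent state is overwritten."""

    def __init__(self, d_model: int, d_goal: int):
        super().__init__()
        self.gate = nn.Linear(d_model + d_goal, d_model)
        self.update = nn.Linear(d_model, d_model)

    def forward(self, state: torch.Tensor, x: torch.Tensor,
                goal: torch.Tensor) -> torch.Tensor:
        # g in (0, 1): per-channel weight on the proposed state update.
        g = torch.sigmoid(self.gate(torch.cat([x, goal], dim=-1)))
        return (1 - g) * state + g * torch.tanh(self.update(x))
```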