mlx-bitnet-mingpt

This repo contains example code to demonstrate the following:

A port of Karpathy's minGPT in MLX.
An extension of mlx-mingpt that uses BitLinear instead of nn.Linear, to demonstrate training language models with ternary weights (1.58 bits per weight, storable in 2 bits) as reported in https://arxiv.org/abs/2402.17764
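
For readers new to the idea, here is a minimal sketch of the absmean weight quantization described in the linked paper. It is not this repo's BitLinear code: the function names `absmean_ternary` and `ternary_linear` are made up for illustration, it uses plain Python lists rather than MLX arrays so it runs anywhere, and it skips the paper's 8-bit activation quantization and the straight-through estimator needed for training.

```python
def absmean_ternary(weights, eps=1e-5):
    """Quantize a 2-D list of floats to {-1, 0, +1} plus one shared scale.

    Hypothetical helper: scale by the mean absolute weight (gamma),
    round to the nearest integer, then clip to [-1, 1].
    """
    flat = [abs(w) for row in weights for w in row]
    gamma = sum(flat) / len(flat)  # mean absolute value of all weights
    scale = gamma + eps            # eps avoids division by zero
    quantized = [
        [max(-1, min(1, round(w / scale))) for w in row]
        for row in weights
    ]
    return quantized, gamma


def ternary_linear(x, quantized, gamma):
    """Apply the quantized layer to one input vector x (no bias).

    Each output is a sum of +x_i, -x_i, or nothing, rescaled by gamma.
    """
    return [gamma * sum(q * xi for q, xi in zip(row, x)) for row in quantized]


# Example: a 2x3 weight matrix applied to a 3-element input.
w = [[0.9, -0.05, -1.2], [0.4, 0.0, -0.7]]
w_q, gamma = absmean_ternary(w)
y = ternary_linear([1.0, 2.0, 3.0], w_q, gamma)
```

Because every quantized weight is -1, 0, or +1, the matrix product reduces to additions and subtractions, with a single multiplication by the scale at the end.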

Reports

See the PDF exports of the notebooks in the demo-results folder:

minGPT in MLX: https://github.com/adhulipa/mlx-mingpt/blob/main/demo-results/mingptmlx.pdf
BitLinear minGPT in MLX: https://github.com/adhulipa/mlx-mingpt/blob/main/demo-results/bitnet-mingptmlx.pdf