This repository contains the official implementation of LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models.
State space models (SSMs), particularly Mamba, have emerged as efficient alternatives to transformers for long-context sequence modeling. While recent efforts provide insights into Mamba's internal mechanisms, they don't explicitly decompose token-wise contributions. LaTIM introduces a novel token-level decomposition method for both Mamba-1 and Mamba-2 that enables fine-grained interpretability.
- Implementation of the LaTIM decomposition method for Mamba models
- Support for both Mamba-1 and Mamba-2 architectures
- Evaluation tools for machine translation, copying, and retrieval-based tasks
- Visualization utilities for token interaction patterns
To install the dependencies, run: pip install -r requirements.txt
The accompanying mamba_ssm (fork) package is available here.
The codebase includes implementations for:
- Mamba Language Models
- Mamba Machine Translation Models
- Mamba Copy Task Models
The src/models/mamba/interpretability_utils.py file contains the code required for interpreting and visualizing token-to-token interactions in Mamba models.
Users can follow the different notebooks (src/notebooks/) to see usage examples.
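For intuition about what a token-to-token interaction means in a state space model, here is a minimal pure-Python sketch. It is not the repository's API and not the exact LaTIM decomposition. It only illustrates the general idea of unrolling a diagonal linear recurrence `h_t = a_t * h_{t-1} + b_t * x_t`, `y_t = c_t . h_t` into per-token contributions. The function names (`ssm_outputs`, `token_contributions`) and the toy parameter values are hypothetical and chosen for illustration.

```python
def ssm_outputs(a, b, c, x):
    """Run a toy diagonal recurrence over a scalar input sequence.

    h_t = a_t * h_{t-1} + b_t * x_t   (elementwise over the state)
    y_t = sum_k c_t[k] * h_t[k]
    """
    h = [0.0] * len(a[0])
    ys = []
    for a_t, b_t, c_t, x_t in zip(a, b, c, x):
        h = [a_k * h_k + b_k * x_t for a_k, h_k, b_k in zip(a_t, h, b_t)]
        ys.append(sum(c_k * h_k for c_k, h_k in zip(c_t, h)))
    return ys


def token_contributions(a, b, c, x):
    """Return contrib[i][j], the part of output i that comes from input token j.

    Unrolling the recurrence gives, for j <= i:
        contrib[i][j] = c_i . (a_i * ... * a_{j+1} * b_j * x_j)
    Entries with j > i stay at zero because the recurrence is causal.
    """
    n_tokens = len(x)
    contrib = [[0.0] * n_tokens for _ in range(n_tokens)]
    for j in range(n_tokens):
        # State written by token j at its own position.
        carried = [b_k * x[j] for b_k in b[j]]
        for i in range(j, n_tokens):
            if i > j:
                # Decay the carried state by the transition at step i.
                carried = [a_k * s_k for a_k, s_k in zip(a[i], carried)]
            contrib[i][j] = sum(c_k * s_k for c_k, s_k in zip(c[i], carried))
    return contrib


# Toy, hand-picked parameters: 3 tokens, state size 2 (illustrative only).
a = [[0.9, 0.5], [0.8, 0.4], [0.7, 0.3]]
b = [[1.0, 0.5], [0.2, 1.0], [0.3, 0.6]]
c = [[1.0, -1.0], [0.5, 0.5], [2.0, 0.1]]
x = [1.0, 2.0, -1.0]

ys = ssm_outputs(a, b, c, x)
contrib = token_contributions(a, b, c, x)

for i, row in enumerate(contrib):
    print(f"y[{i}] = {ys[i]:+.4f}  contributions = {[round(v, 4) for v in row]}")
```

Each row of `contrib` sums to the corresponding output, so the matrix can be read much like an attention map: entry `(i, j)` is how much token `j` contributes to position `i`. For the actual decomposition, including the Mamba-1 and Mamba-2 specifics, refer to src/models/mamba/interpretability_utils.py and the notebooks.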
If you use this code in your research, please cite:
@misc{pitorro2025latimmeasuringlatenttokentotoken,
title={LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models},
author={Hugo Pitorro and Marcos Treviso},
year={2025},
eprint={2502.15612},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.15612},
}