Official codebase for NuCLR as presented in "Know Thyself by Knowing Others: Learning Neuron Identity from Population Context"
[ Project Page ]
[ Paper ]
[ Poster ]
[ OpenReview ]
[ Tweet Thread ]
This project was developed on Python 3.10, with environment management using uv. To set up the environment, run:
uv venv venv -p 3.10
source venv/bin/activate
uv pip install -r requirements.txt
Follow the steps below to train and evaluate your own NuCLR model.
- To preprocess datasets, follow the steps in preprocess/README.md.
- Download the metadata (CSV files) about neurons in all four datasets from this link and unzip it into ./neuron_metadata.
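After unzipping, a quick sanity check can confirm that the metadata files landed where the training code expects them. The snippet below is a minimal sketch, not part of this repository: the helper name `summarize_metadata` is hypothetical, and it assumes only that the archive contains `.csv` files with a header row. It makes no assumption about column names.

```python
import csv
from pathlib import Path


def summarize_metadata(metadata_dir="./neuron_metadata"):
    """Return {file name: (column names, row count)} for each CSV found.

    Hypothetical helper for illustration; searches subfolders too, in case
    the archive unpacks into a nested directory.
    """
    summary = {}
    for csv_path in sorted(Path(metadata_dir).rglob("*.csv")):
        with csv_path.open(newline="") as f:
            reader = csv.reader(f)
            header = next(reader, [])  # first row is assumed to be the header
            n_rows = sum(1 for _ in reader)  # remaining rows are data rows
        summary[csv_path.name] = (header, n_rows)
    return summary


if __name__ == "__main__":
    for name, (header, n_rows) in summarize_metadata().items():
        print(f"{name}: {n_rows} rows, columns={header}")
```

If nothing is printed, the directory is empty or missing, which usually means the archive was unzipped somewhere else.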
To train on electrophysiology data (IBL, Allen, Steinmetz et al.):
python train.py --config-name train_ephys data=<data-config> num_epochs=<num_epochs>
To train on calcium imaging data (Bugeon et al.):
python train.py --config-name train_ca data=<data-config> num_epochs=<num_epochs>
- We use Hydra for managing configs.
- Options for <data-config> can be found in configs/data/*.yaml, e.g. data=ibl_bwm_probes_dev.
- Set num_epochs such that the total number of training steps is roughly 50,000.
- Checkpoints are stored in ../ckpt by default.
- Other available configurations can be found in configs/train_ephys.yaml and configs/train_ca.yaml.
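The num_epochs guideline above comes down to simple arithmetic once you know how many optimizer steps one epoch takes. The sketch below is illustrative only: the function name and the example numbers are hypothetical, and it assumes one step per batch with the final partial batch kept. Take the actual sample count and batch size from your own data and training configs.

```python
import math


def epochs_for_target_steps(num_samples, batch_size, target_steps=50_000):
    """Smallest num_epochs whose total step count reaches target_steps.

    Assumes one optimizer step per batch and that the last, partial batch
    of each epoch is kept (hence the ceil on steps_per_epoch).
    """
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return math.ceil(target_steps / steps_per_epoch)


# Hypothetical numbers, for illustration only:
# 12,800 samples / batch size 64 = 200 steps per epoch, so 50,000 / 200 = 250.
print(epochs_for_target_steps(num_samples=12_800, batch_size=64))  # 250
```

The resulting value can then be passed on the command line as num_epochs=<num_epochs>.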
A final forward pass over the entire dataset is needed to get the embeddings from a particular checkpoint. The training script prints a "run_id" for the corresponding run. Use it to run the following command:
bash utils/forward_all_epochs.sh <run_id> <data-config-name> [batch_size] [epoch_stride]
This stores the embeddings in ../embs/<run_id>/embs_epoch_<num>.pt.
In most cases, you should use the "transductive" versions of each dataset while gathering these embeddings, since we
want to compute embeddings for all neurons here.
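Because the forward pass writes one file per epoch, it can help to enumerate them in epoch order before evaluation. The following is a minimal sketch rather than a utility shipped with this repository: `list_embedding_files` is a hypothetical helper, and it relies only on the `../embs/<run_id>/embs_epoch_<num>.pt` naming described above.

```python
import re
from pathlib import Path

# Matches the file naming described above: embs_epoch_<num>.pt
EPOCH_PATTERN = re.compile(r"embs_epoch_(\d+)\.pt$")


def list_embedding_files(run_id, embs_root="../embs"):
    """Return [(epoch, path), ...] sorted by epoch number for one run."""
    found = []
    for path in (Path(embs_root) / run_id).glob("embs_epoch_*.pt"):
        match = EPOCH_PATTERN.search(path.name)
        if match:
            # Parse the epoch as an integer so that 10 sorts after 2.
            found.append((int(match.group(1)), path))
    return sorted(found)


# Example usage (the run_id placeholder must be replaced with your own):
# files = list_embedding_files("<run_id>")
# last_epoch, last_path = files[-1]
# Assumption: the .pt files were written with torch.save, in which case
# torch.load(last_path) would read them back.
```

Sorting numerically rather than by file name avoids the usual pitfall where "embs_epoch_10.pt" is listed before "embs_epoch_2.pt".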
Once you have the embeddings, follow the instructions in eval_scripts/README.md to run our evaluation scripts.
If you find this repository useful in your research, please consider giving it a star ⭐ and citing the paper:
@inproceedings{
arora2025nuclr,
title={Know Thyself by Knowing Others: Learning Neuron Identity from Population Context},
author={Vinam Arora and Divyansha Lachi and Ian J Knight and Mehdi Azabou and Blake Richards and Cole Hurwitz and Joshua H Siegle and Eva L Dyer},
booktitle={Thirty-ninth Conference on Neural Information Processing Systems},
year={2025},
url={https://arxiv.org/abs/2512.01199}
}
