Code for "Know Thyself by Knowing Others: Learning Neuron Identity from Population Context" published at NeurIPS 2025

NuCLR

Official codebase for NuCLR as presented in "Know Thyself by Knowing Others: Learning Neuron Identity from Population Context"

[ Project Page ] [ Paper ] [ Poster ] [ OpenReview ] [ Tweet Thread ]

NuCLR Architecture Diagram

Usage

This project was developed on Python 3.10, with environment management using uv. To set up the environment, run:

uv venv venv -p 3.10
source venv/bin/activate
uv pip install -r requirements.txt

Follow the steps below to train and evaluate your own NuCLR model.

1. Preprocessing Data

  1. To preprocess datasets, please follow the steps in preprocess/README.md.

  2. Download the neuron metadata (CSV files) for all four datasets from this link and unzip it into ./neuron_metadata.

2. Training

To train on electrophysiology data (IBL, Allen, Steinmetz et al.):

python train.py --config-name train_ephys data=<data-config> num_epochs=<num_epochs>

To train on calcium imaging data (Bugeon et al.):

python train.py --config-name train_ca data=<data-config> num_epochs=<num_epochs>

Notes for both commands:

  • We use Hydra for managing configs.
  • Options for <data-config> can be found in configs/data/*.yaml, e.g. data=ibl_bwm_probes_dev.
  • Set num_epochs such that the total number of training steps is roughly 50,000.
  • Checkpoints are stored in ../ckpt by default.
  • Other available configuration options can be found in configs/train_ephys.yaml and configs/train_ca.yaml.
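
The num_epochs guideline above can be turned into a quick calculation. The sketch below is not part of the NuCLR codebase: the helper name `epochs_for_target_steps` is hypothetical, and it assumes you already know how many training steps one epoch takes for your chosen data config (for example, from the progress output of a short trial run).

```python
def epochs_for_target_steps(steps_per_epoch: int, target_steps: int = 50_000) -> int:
    """Return the epoch count whose total step count is closest to target_steps.

    steps_per_epoch is an assumption supplied by the user; this helper does
    not read it from the NuCLR configs or data loaders.
    """
    if steps_per_epoch <= 0:
        raise ValueError("steps_per_epoch must be a positive integer")
    # Round to the nearest whole epoch, but always train for at least one.
    return max(1, round(target_steps / steps_per_epoch))


if __name__ == "__main__":
    # Hypothetical example: 125 steps per epoch.
    print(epochs_for_target_steps(125))
```

The returned value can then be passed on the command line as num_epochs=<num_epochs>.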

3. Evaluation

A final forward pass over the entire dataset is needed to extract the embeddings from a particular checkpoint. The training script prints a "run_id" for the corresponding run. Use it to run the following command:

bash utils/forward_all_epochs.sh <run_id> <data-config-name> [batch_size] [epoch_stride]

This stores the embeddings in ../embs/<run_id>/embs_epoch_<num>.pt. In most cases, you should use the "transductive" version of each dataset when gathering these embeddings, since the goal here is to compute embeddings for all neurons.
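
To see which epochs have embeddings available for a run, a small helper along these lines may be handy. It is a sketch only, not part of the NuCLR codebase: the function name `list_embedding_files` is hypothetical, and it relies solely on the ../embs/<run_id>/embs_epoch_<num>.pt naming described above.

```python
import re
from pathlib import Path

# Matches the file naming described above: embs_epoch_<num>.pt
EMB_PATTERN = re.compile(r"^embs_epoch_(\d+)\.pt$")


def list_embedding_files(run_id: str, embs_root: str = "../embs") -> list[tuple[int, Path]]:
    """Return (epoch, path) pairs for one run, sorted by epoch number."""
    run_dir = Path(embs_root) / run_id
    found = []
    for path in run_dir.glob("embs_epoch_*.pt"):
        match = EMB_PATTERN.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    # Sort numerically so that epoch 10 comes after epoch 2.
    return sorted(found, key=lambda pair: pair[0])


if __name__ == "__main__":
    # "my_run_id" is a placeholder; use the run_id printed by the training script.
    for epoch, path in list_embedding_files("my_run_id"):
        print(epoch, path)
```

Each returned path can then be opened with torch.load; the layout of the loaded object is not described in this README, so inspect it before relying on any particular structure.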

Once you have the embeddings, follow the instructions in eval_scripts/README.md to run our evaluation scripts.

Citation

If you find this repository useful in your research, please consider giving it a star ⭐ and citing the paper:

@inproceedings{
    arora2025nuclr,
    title={Know Thyself by Knowing Others: Learning Neuron Identity from Population Context},
    author={Vinam Arora and Divyansha Lachi and Ian J Knight and Mehdi Azabou and Blake Richards and Cole Hurwitz and Joshua H Siegle and Eva L Dyer},
    booktitle={Thirty-ninth Conference on Neural Information Processing Systems},
    year={2025},
    url={https://arxiv.org/abs/2512.01199}
}
