This is the official repository for MouseSIS: A Frames-and-Events Dataset for Space-Time Instance Segmentation of Mice, accepted at the Workshop on Neuromorphic Vision in conjunction with ECCV 2024 by Friedhelm Hamann, Hanxiong Li, Paul Mieske, Lars Lewejohann and Guillermo Gallego.
This dataset is the basis for the SIS Challenge hosted in conjunction with the CVPR 2025 Workshop on Event-based Vision.
- Space-time instance segmentation dataset focused on mice tracking
- Combined frames and event data from a neuromorphic vision sensor
- 33 sequences (~20 seconds each, ~600 frames per sequence)
- YouTubeVIS-style annotations
- Baseline implementation and evaluation metrics included
- v1.0.0 (Current, February 2024): Major refactoring and updates, including improved documentation.
- v0.1.0 (September 2023): Initial release with basic functionality and dataset.
- Quickstart
- Installation
- Data Preparation
- Evaluation
- Acknowledgements
- Citation
- Additional Resources
- License
If you want to work with the dataset, the quickest way to access the data and get an idea of its structure is to download one sequence (e.g. seq12.h5) and the annotations of the corresponding split, and visualize the data:
python scripts/visualize_events_frames_and_masks.py --h5_path data/MouseSIS/top/val/seq12.h5 --annotation_path data/MouseSIS/val_annotations.json
This requires `h5py`, `numpy`, `Pillow`, and `tqdm`. The full dataset structure is explained here.
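For a first programmatic look at a sequence file, the following minimal sketch lists the contents of the HDF5 file with `h5py`; it makes no assumptions about the exact group and dataset names and simply prints whatever is stored:

```python
# List every dataset (name, shape, dtype) stored in one MouseSIS sequence file.
import h5py

with h5py.File("data/MouseSIS/top/val/seq12.h5", "r") as f:
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(f"{name}: shape={obj.shape}, dtype={obj.dtype}")
    f.visititems(show)
```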
- Clone the repository:
git clone git@github.com:tub-rip/MouseSIS.git
cd MouseSIS
- Set up the environment:
conda create --name MouseSIS python=3.8
conda activate MouseSIS
- Install PyTorch (choose a command compatible with your CUDA version from the PyTorch website), e.g.:
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia
- Install other dependencies:
pip install -r requirements.txt
- Create a folder for the original data:
cd <project-root>
mkdir -p data/MouseSIS
- Download the data and annotations and save them in `<project-root>/data/MouseSIS`. You do not necessarily need to download the whole dataset; for example, you can download only the sequences you want to evaluate on. The `data/MouseSIS` folder should be organized as follows:

data/MouseSIS
│
├── top/
│   ├── train
│   │   ├── seq_02.hdf5
│   │   ├── seq_05.hdf5
│   │   ├── ...
│   │   └── seq_33.hdf5
│   ├── val
│   │   ├── seq_03.hdf5
│   │   ├── seq_04.hdf5
│   │   ├── ...
│   │   └── seq_25.hdf5
│   └── test
│       ├── seq_01.hdf5
│       ├── seq_07.hdf5
│       ├── ...
│       └── seq_32.hdf5
├── dataset_info.csv
├── val_annotations.json
└── train_annotations.json
- `top/`: This directory contains the frame and event data of the mouse dataset captured from the top view, stored as 33 individual `.hdf5` files. Each file contains approximately 20 seconds of data (around 600 frames), along with temporally aligned events.
- `dataset_info.csv`: This CSV file contains metadata for each sequence, such as recording dates, providing additional context and details about the dataset.
- `<split>_annotations.json`: The top-view annotation file for the respective split follows a structure similar to MSCOCO's JSON format, with some modifications. Note that the test annotations are not publicly available. The JSON files are defined as follows:
{
    "info": {
        "description": "string",          // Dataset description
        "version": "string",              // Version identifier
        "date_created": "string"          // Creation timestamp
    },
    "videos": [
        {
            "id": "string",               // Video identifier (range: "01" to "33")
            "width": integer,             // Frame width in pixels (1280)
            "height": integer,            // Frame height in pixels (720)
            "length": integer             // Total number of frames
        }
    ],
    "annotations": [
        {
            "id": integer,                // Unique instance identifier
            "video_id": "string",         // Reference to parent video
            "category_id": integer,       // Object category (1 = mouse)
            "segmentations": [
                {
                    "size": [height: integer, width: integer],  // Mask dimensions
                    "counts": "string"    // RLE-encoded segmentation mask
                }
            ],
            "areas": [float],             // Object area in pixels
            "bboxes": [                   // Bounding box coordinates
                [x_min: float, y_min: float, width: float, height: float]
            ],
            "iscrowd": integer            // Crowd annotation flag (0 or 1)
        }
    ],
    "categories": [
        {
            "id": integer,                // Category identifier
            "name": "string",             // Category name
            "supercategory": "string"     // Parent category
        }
    ]
}
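To get a feel for the annotation format, here is a minimal sketch that loads the validation annotations and decodes one segmentation mask. Since the masks are described above as COCO-style RLE, `pycocotools` is assumed here for decoding (it is not required by the repository scripts):

```python
# Load the validation annotations and decode the first frame's mask of one instance.
import json
from pycocotools import mask as mask_utils  # assumed: COCO-style RLE decoding

with open("data/MouseSIS/val_annotations.json", "r") as f:
    data = json.load(f)

track = data["annotations"][0]            # one annotated mouse instance (track)
seg = track["segmentations"][0]           # mask of the first frame (may be empty)
if seg:
    binary_mask = mask_utils.decode(seg)  # numpy array of shape (height, width)
    print("video", track["video_id"], "instance", track["id"],
          "mask pixels:", int(binary_mask.sum()))
```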
Download the model weights:
cd <project-root>
mkdir models
# Download yolo_e2vid.pt, yolo_frame.pt, and XMem.pth from the provided link
# and place them in the models directory
Afterwards, the `models` folder should be organized as follows:
models
├── XMem.pth
├── yolo_e2vid.pt
└── yolo_frame.pt
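As a quick sanity check that the weights are in place, a minimal sketch (filenames taken from the tree above):

```python
# Verify that the expected model weight files exist in the models directory.
from pathlib import Path

models_dir = Path("models")
for name in ["XMem.pth", "yolo_e2vid.pt", "yolo_frame.pt"]:
    path = models_dir / name
    print(f"{path}: {'found' if path.exists() else 'MISSING'}")
```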
This preprocessing step is required only when evaluating the ModelMixSort method from the paper. It relies on e2vid images reconstructed at the grayscale image timesteps.
python scripts/preprocess_events_to_e2vid_images.py --data_root data/MouseSIS
After downloading the data and model weights, proceed with the evaluation. First, run inference, e.g. using the provided inference script:
python3 scripts/inference.py --config <path-to-config-yaml>
This saves a file `output/<tracker-name>/final_results.json`, which contains the predictions in this structure:
[
{
"video_id": int,
"score": float,
"instance_id": int,
"category_id": int,
"segmentations": [
null | {
"size": [int, int],
"counts": "RLE encoded string"
},
...
],
},
...
]
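If you run your own model instead of the provided inference script, here is a minimal sketch of writing a predictions file in this structure (the tracker name `my_tracker` and all values are placeholders):

```python
# Write a predictions file in the structure expected by the evaluation script.
import json
from pathlib import Path

predictions = [
    {
        "video_id": 25,          # sequence number
        "score": 0.9,            # confidence of the whole track
        "instance_id": 1,
        "category_id": 1,        # 1 = mouse
        "segmentations": [
            None,                                             # frame where the instance is absent
            {"size": [720, 1280], "counts": "<RLE string>"},  # RLE mask for one frame
            # ... one entry per frame of the sequence
        ],
    },
]

out_dir = Path("output/my_tracker")
out_dir.mkdir(parents=True, exist_ok=True)
with open(out_dir / "final_results.json", "w") as f:
    json.dump(predictions, f)
```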
Then run the evaluation script like this:
python src/TrackEval/run_mouse_eval.py --TRACKERS_TO_EVAL <tracker-name> --SPLIT_TO_EVAL <split-name>
Specific options are listed below.
This section describes how to run a minimal evaluation workflow on one sequence of the validation set. Download only the sequence `seq_25.hdf5` from the validation set and the corresponding annotations `val_annotations.json`. The resulting folder should look as follows:
data/MouseSIS
│
├── top/
│   └── val
│       └── seq_25.hdf5
└── val_annotations.json
Now you can run inference as
python3 scripts/inference.py --config configs/predict/quickstart.yaml
and then evaluation as
python scripts/eval.py --TRACKERS_TO_EVAL quickstart --SPLIT_TO_EVAL val
This should return the following results:
Sequence | HOTA | MOTA | IDF1 |
---|---|---|---|
25 | 30.15 | 39.125 | 35.315 |
Avg. | 30.15 | 39.125 | 35.315 |
Proceed similarly to the quickstart, but download all sequences of the validation set (sequences 3, 4, 12, 25).
python3 scripts/inference.py --config configs/predict/combined_on_validation.yaml
python scripts/eval.py --TRACKERS_TO_EVAL combined_on_validation --SPLIT_TO_EVAL val
Here you should get the following results:
Sequence | HOTA | MOTA | IDF1 |
---|---|---|---|
3 | 54.679 | 72.432 | 60.212 |
4 | 51.717 | 64.942 | 58.36 |
12 | 39.497 | 66.049 | 45.431 |
25 | 30.15 | 39.125 | 35.315 |
Avg. | 45.256 | 62.097 | 50.459 |
In this case, download all test sequences and run
python3 scripts/inference.py --config configs/predict/sis_challenge_baseline.yaml
For evaluation, you can upload the `final_results.json` to the challenge/benchmark page, which results in the following combined metrics:
Sequence | HOTA | MOTA | IDF1 |
---|---|---|---|
Avg. | 0.43 | 0.45 | 0.5 |
Please note that the results differ slightly from those reported in the paper due to updates made for the challenge. Refer to version v0.1.0 to reproduce the exact paper results.
We gratefully acknowledge the following repositories and thank the authors for their excellent work:
If you find this work useful in your research, please consider citing:
@inproceedings{hamann2024mousesis,
title={MouseSIS: A Frames-and-Events Dataset for Space-Time Instance Segmentation of Mice},
author={Hamann, Friedhelm and Li, Hanxiong and Mieske, Paul and Lewejohann, Lars and Gallego, Guillermo},
booktitle={Proceedings of the European Conference on Computer Vision Workshops (ECCVW)},
year={2024}
}
- Recording Software (CoCapture)
- TU Berlin, RIP lab Homepage
- Science Of Intelligence Homepage
- Event Camera Class at TU Berlin
- Event-based Vision Survey Paper
- List of Event Vision Resources
This project is licensed under the MIT License - see the LICENSE file for details.