GapT is a gap-filling framework leveraging past, present, and future information alongside relevant covariates to fill in gaps in time series.
Time series data is fundamental to environmental research. However, gaps often occur due to equipment malfunctions, data transmission errors, or adverse environmental conditions. Accurate gap filling is crucial for enhancing the quality of time series data, enabling better modeling, forecasting, and analysis. Traditional approaches include linear interpolation, ARIMA, and
GapT builds on the seq2seq paradigm and introduces architectural improvements that enhance performance in time series gap filling. You can choose from several encoders: dilated 1D convolution, bidirectional LSTM, GRU, LRU, and the Transformer. The code is an open-source, improved implementation of Richard et al., Filling Gaps in Micro-meteorological Data, ECML PKDD 2020, written in PyTorch Lightning.
GapT is evaluated here using aerosol concentration data (
Given a time series of measurements, the objective is to predict missing values within a target sequence using covariates and the incomplete sequence of targets. This is formulated as:
Let
In this formulation,
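As an illustration of the setup described above, the following minimal sketch builds model inputs from covariates and an incomplete target sequence. It assumes gaps are marked with `None`, zero-fills them, and appends an observed/missing flag. The function name `build_inputs` and this particular encoding are hypothetical and are not taken from the GapT code.

```python
from typing import List, Optional, Tuple


def build_inputs(
    covariates: List[List[float]],
    targets: List[Optional[float]],
) -> Tuple[List[List[float]], List[int]]:
    """Return per-time-step feature rows and the indices of the gaps.

    Each row is [covariates..., zero-filled target, observed flag].
    Hypothetical encoding, for illustration only.
    """
    if len(covariates) != len(targets):
        raise ValueError("covariates and targets must have the same length")

    rows, gap_idx = [], []
    for t, (cov, y) in enumerate(zip(covariates, targets)):
        observed = y is not None
        if not observed:
            gap_idx.append(t)  # time steps the model has to predict
        rows.append(list(cov) + [y if observed else 0.0, 1.0 if observed else 0.0])
    return rows, gap_idx


# Four time steps, two covariates each, with a two-step gap in the target.
covariates = [[0.1, 5.0], [0.2, 5.5], [0.3, 6.0], [0.4, 6.5]]
targets = [1.0, None, None, 1.6]
rows, gap_idx = build_inputs(covariates, targets)
print(gap_idx)  # [1, 2]
print(rows[1])  # [0.2, 5.5, 0.0, 0.0]
```

The covariates stay available inside the gap, so a model can use them together with the observed targets before and after the gap to predict the missing steps.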
GapT is designed modularly for flexibility and extensibility. The encoder block can currently be chosen as transformer, lru, lstm, gru, tcn, or mlp. The encoder modules are illustrated below:
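To make the idea of a swappable encoder concrete, here is a minimal PyTorch sketch of a factory that returns an encoder by name. It is an assumption about how such a selection could be wired, not the repository's implementation: `build_encoder` and `RNNEncoder` are hypothetical names, the default sizes are illustrative, and the `lru` and `tcn` options are left out for brevity.

```python
import torch
from torch import nn


class RNNEncoder(nn.Module):
    """Bidirectional LSTM/GRU returning a (batch, seq, d_model) tensor."""

    def __init__(self, cell: str, d_model: int, n_layers: int):
        super().__init__()
        rnn_cls = {"lstm": nn.LSTM, "gru": nn.GRU}[cell]
        # Half-width hidden state per direction, so the concatenated
        # forward/backward output is d_model wide (d_model must be even).
        self.rnn = rnn_cls(d_model, d_model // 2, num_layers=n_layers,
                           batch_first=True, bidirectional=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.rnn(x)  # drop the final hidden state
        return out


def build_encoder(name: str, d_model: int = 128, n_head: int = 8,
                  n_layers: int = 2, d_feedforward: int = 256,
                  dropout: float = 0.2) -> nn.Module:
    """Hypothetical factory: map an encoder name to a module."""
    if name == "transformer":
        layer = nn.TransformerEncoderLayer(
            d_model, n_head, dim_feedforward=d_feedforward,
            dropout=dropout, batch_first=True)
        return nn.TransformerEncoder(layer, num_layers=n_layers)
    if name in ("lstm", "gru"):
        return RNNEncoder(name, d_model, n_layers)
    if name == "mlp":
        return nn.Sequential(
            nn.Linear(d_model, d_feedforward), nn.ReLU(),
            nn.Dropout(dropout), nn.Linear(d_feedforward, d_model))
    raise ValueError(f"unsupported encoder in this sketch: {name}")


x = torch.randn(2, 10, 128)  # (batch, sequence length, d_model)
shapes = {name: tuple(build_encoder(name)(x).shape)
          for name in ("transformer", "lstm", "gru", "mlp")}
print(shapes)
```

Every encoder in the sketch maps a `(batch, seq, d_model)` tensor to a tensor of the same shape, which is what allows the surrounding embedding and output layers to stay unchanged when the encoder is swapped.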
We also explore two embedding strategies:
- Combined Embedding: Concatenates covariates and targets before embedding.
- Separated Embedding: Embeds covariates and targets separately, then concatenates them.
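The two strategies above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the repository's code: the class names are hypothetical, both embeddings are plain linear projections, and the separated variant splits `d_model` evenly between covariates and targets so that both variants produce the same output width.

```python
import torch
from torch import nn


class CombinedEmbedding(nn.Module):
    """Concatenate covariates and targets, then embed them jointly."""

    def __init__(self, n_cov: int, n_tgt: int, d_model: int):
        super().__init__()
        self.proj = nn.Linear(n_cov + n_tgt, d_model)

    def forward(self, cov: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        return self.proj(torch.cat([cov, tgt], dim=-1))


class SeparatedEmbedding(nn.Module):
    """Embed covariates and targets separately, then concatenate."""

    def __init__(self, n_cov: int, n_tgt: int, d_model: int):
        super().__init__()
        # Assumption: split the embedding width between the two inputs.
        self.cov_proj = nn.Linear(n_cov, d_model // 2)
        self.tgt_proj = nn.Linear(n_tgt, d_model - d_model // 2)

    def forward(self, cov: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        return torch.cat([self.cov_proj(cov), self.tgt_proj(tgt)], dim=-1)


cov = torch.randn(4, 16, 5)  # (batch, sequence length, covariates)
tgt = torch.randn(4, 16, 1)  # (batch, sequence length, targets)
combined = CombinedEmbedding(5, 1, 128)(cov, tgt)
separated = SeparatedEmbedding(5, 1, 128)(cov, tgt)
print(combined.shape, separated.shape)
```

In the combined variant every embedding dimension can mix covariate and target features, while the separated variant keeps the two sources in disjoint halves of the embedding until the encoder processes them.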
The embedding variations and the full modular architecture are shown below:
Plots are created in notebooks/produce_results.ipynb.
Requires PyTorch Lightning.
Individual run:
python train.py \
--devices 1 \
--num_workers 32 \
--batch_size 256 \
--epochs 60 \
--n_head 8 \
--n_layers 6 \
--d_model 128 \
--d_feedforward 256 \
--learning_rate 0.01 \
--dropout_rate 0.2 \
--optimizer momo \
--model gapt \
--data_dir data/two_week_seq \
--output_dir results/gapt
Full experiment:
bash submit.sh
Baseline
@inproceedings{richard2021,
title={Filling gaps in micro-meteorological data},
author={Richard, Antoine and Fine, Lior and Rozenstein, Offer and Tanny, Josef and Geist, Matthieu and Pradalier, Cedric},
booktitle={Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14--18, 2020, Proceedings, Part V},
pages={101--117},
year={2021},
}
GapT
@misc{gapt2025,
title={GapT: Gap-filling Transformer for Multivariate Timeseries},
author={Holmberg, Daniel},
year={2025},
howpublished={\url{https://github.com/deinal/gapt}},
}