In traditional Econometrics there are many forms of time series models, and the calibration methodology and theoretical guarantees of the estimator are usually tailored to each specific model.
With the rise of neural networks, it is natural to study how well a generic LSTM can fit time series. In particular, we simulate time series with latent states and check whether the cell contents of a fitted LSTM are able to "memorise/match" the latent states of the simulated series.
Setup:
- Peephole version of LSTM
- SGD with Momentum
- Simulate 5000 univariate trajectories for training, 500 for testing (a simulation sketch follows this list)
- The first 50 steps of each trajectory are used for "burn-in"
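For concreteness, here is a minimal sketch of the two simulated processes used throughout; all parameters (persistence, noise scales, transition probabilities, trajectory length) are illustrative assumptions rather than the exact values used in the experiments.

```python
import numpy as np

def simulate_sv(n_steps=250, mu=-1.0, phi=0.95, sigma_eta=0.2, rng=None):
    """Stochastic Volatility: the latent log-variance h_t follows an AR(1);
    the observation is y_t = exp(h_t / 2) * eps_t with standard normal eps_t."""
    rng = rng or np.random.default_rng()
    h = np.empty(n_steps)
    h[0] = mu
    for t in range(1, n_steps):
        h[t] = mu + phi * (h[t - 1] - mu) + sigma_eta * rng.standard_normal()
    y = np.exp(h / 2) * rng.standard_normal(n_steps)
    return y, h                     # observation, latent state (log-variance)

def simulate_regime_switching(n_steps=250, p_stay=0.98, mus=(0.05, -0.05),
                              sigmas=(0.5, 1.5), rng=None):
    """Two-regime Markov-switching Gaussian process; the latent state is the
    regime index s_t."""
    rng = rng or np.random.default_rng()
    s = np.empty(n_steps, dtype=int)
    s[0] = rng.integers(2)
    for t in range(1, n_steps):
        s[t] = s[t - 1] if rng.random() < p_stay else 1 - s[t - 1]
    y = np.take(mus, s) + np.take(sigmas, s) * rng.standard_normal(n_steps)
    return y, s                     # observation, latent state (regime)

# 5000 training and 500 test trajectories; the first 50 steps are burn-in.
train = [simulate_sv() for _ in range(5000)]
test = [simulate_sv() for _ in range(500)]
```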
Step-by-Step:
- Fit an LSTM on the simulated training trajectories
- Extract the cell contents from the fitted LSTM on the training trajectories
- Regress/classify the ground-truth latent states of the time series against the cell contents of the LSTM, still on the training trajectories
- Run the regressor/classifier learnt above on the test trajectories (running the fitted LSTM on the test trajectories beforehand to extract the cell contents); the full pipeline is sketched below
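A sketch of this pipeline for the Stochastic Volatility case, assuming PyTorch and scikit-learn and reusing the hypothetical `simulate_sv` trajectories above. For brevity it uses PyTorch's standard `nn.LSTM`, which has no peephole connections (a peephole variant would need a custom cell), and implements the 50%-quantile loss as the MAE, to which it is proportional; the general pinball loss is sketched further below.

```python
import numpy as np
import torch
import torch.nn as nn
from scipy.stats import spearmanr
from sklearn.ensemble import RandomForestRegressor

BURN_IN, HIDDEN = 50, 16

class QuantileLSTM(nn.Module):
    """One-step-ahead quantile forecaster whose cell contents c_t we probe."""
    def __init__(self, input_size=1, hidden_size=HIDDEN):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                     # x: (batch, T, features)
        out, _ = self.lstm(x)
        return self.head(out)

def cell_states(model, x):
    """Step through the LSTM manually to record the cell content c_t,
    which nn.LSTM does not expose per step."""
    h = torch.zeros(1, x.shape[0], model.lstm.hidden_size)
    c = torch.zeros_like(h)
    cells = []
    for t in range(x.shape[1]):
        _, (h, c) = model.lstm(x[:, t:t + 1, :], (h, c))
        cells.append(c[0])
    return torch.stack(cells, dim=1)          # (batch, T, hidden)

# Tensors from the simulated (y, h) pairs above.
Y_train = torch.tensor(np.stack([y for y, _ in train]), dtype=torch.float32).unsqueeze(-1)
Y_test = torch.tensor(np.stack([y for y, _ in test]), dtype=torch.float32).unsqueeze(-1)
H_train = np.stack([h for _, h in train])[:, BURN_IN:].reshape(-1)
H_test = np.stack([h for _, h in test])[:, BURN_IN:].reshape(-1)

# Step 1: fit the LSTM against the 50%-quantile loss with SGD + Momentum.
model = QuantileLSTM()
opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
for _ in range(20):                           # number of epochs is illustrative
    loss = (model(Y_train[:, :-1, :]) - Y_train[:, 1:, :]).abs().mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Steps 2-3: extract cell contents, regress the latent state against them.
with torch.no_grad():
    C_train = cell_states(model, Y_train)[:, BURN_IN:, :].reshape(-1, HIDDEN).numpy()
reg = RandomForestRegressor(n_estimators=100).fit(C_train, H_train)

# Step 4: rerun on the test trajectories and score the regressor.
with torch.no_grad():
    C_test = cell_states(model, Y_test)[:, BURN_IN:, :].reshape(-1, HIDDEN).numpy()
rho, _ = spearmanr(reg.predict(C_test), H_test)
```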
If we fit against the 50%-quantile loss, then for a Stochastic Volatility process the Spearman correlation between the predicted and ground-truth latent states is around 0.55 +/- 0.2, whether we fit a Linear Regressor or a Random Forest Regressor.
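For reference, the quantile (pinball) loss for a generic level q; at q = 0.5 it is half the absolute error, which is why the training loop above could use the MAE directly.

```python
def pinball_loss(pred, target, q):
    """Pinball loss: under-predictions are penalised with weight q,
    over-predictions with weight 1 - q."""
    err = target - pred
    return torch.maximum(q * err, (q - 1) * err).mean()
```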
For a Regime-Switching process, the Cross-Entropy is 0.64 +/- 0.1 for a Logistic Classifier and 0.55 +/- 0.1 for a Random Forest Classifier.
This is at best moderate evidence that the LSTM truly "understood" and learnt the internal state of the simulated time series.
However, if we instead fit 5 LSTMs against the 5%-, 25%-, 50%-, 75%- and 95%-quantile losses respectively, then for the Stochastic Volatility process the latent state (i.e. the variance) of the time series is learnt much better when we pool together the cell contents of the 5 fitted LSTMs.
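A sketch of this pooling, reusing the hypothetical helpers above: one LSTM is fitted per quantile level, and their cell contents are concatenated into a single feature matrix before regressing the latent variance.

```python
QUANTILES = [0.05, 0.25, 0.50, 0.75, 0.95]

models = []
for q in QUANTILES:
    m = QuantileLSTM()
    opt = torch.optim.SGD(m.parameters(), lr=1e-2, momentum=0.9)
    for _ in range(20):
        loss = pinball_loss(m(Y_train[:, :-1, :]), Y_train[:, 1:, :], q)
        opt.zero_grad(); loss.backward(); opt.step()
    models.append(m)

# Pool the cell contents of the 5 fitted LSTMs into one feature matrix.
with torch.no_grad():
    C_pooled = torch.cat([cell_states(m, Y_train)[:, BURN_IN:, :]
                          for m in models], dim=-1)
reg = RandomForestRegressor(n_estimators=100).fit(
    C_pooled.reshape(-1, 5 * HIDDEN).numpy(), H_train)
```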
For the Regime-Switching model, even full quantile regressions are still unable to learn the latent state, possibly because fitting high-variance variables takes precedence over low-variance variables in regression. One way to improve on this would be to feed latent-state-relevant information directly to the LSTM. As the simplest (and almost silly) way to verify this, we feed the ground-truth regime to the LSTM during fitting; the cell contents of the fitted LSTMs are then able to learn the regime of the time series.
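A sketch of this sanity check, again reusing the hypothetical helpers above: the ground-truth regime is stacked with the observation as a second input feature, and the cell contents are then classified against the regime.

```python
from sklearn.linear_model import LogisticRegression

train_rs = [simulate_regime_switching() for _ in range(5000)]
y_rs = torch.tensor(np.stack([y for y, _ in train_rs]), dtype=torch.float32)
s_rs = torch.tensor(np.stack([s for _, s in train_rs]), dtype=torch.float32)
X = torch.stack([y_rs, s_rs], dim=-1)         # (batch, T, 2): observation + regime

model_rs = QuantileLSTM(input_size=2)
opt = torch.optim.SGD(model_rs.parameters(), lr=1e-2, momentum=0.9)
for _ in range(20):
    loss = pinball_loss(model_rs(X[:, :-1, :]), y_rs[:, 1:, None], 0.5)
    opt.zero_grad(); loss.backward(); opt.step()

# The cell contents now separate the regimes; a Logistic Classifier recovers them.
with torch.no_grad():
    C = cell_states(model_rs, X)[:, BURN_IN:, :].reshape(-1, HIDDEN).numpy()
clf = LogisticRegression().fit(C, s_rs[:, BURN_IN:].reshape(-1).numpy())
```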
In summary: to learn the internal state of a time series, we have to model the full distribution of the realisations, and feed the model information directly relevant to the regime of the series.