- [2009 ICML] Curriculum Learning, [paper].
- [2010 AISTATS] Understanding the difficulty of training deep feedforward neural networks, [paper].
- [2011 ICML] On Optimization Methods for Deep Learning, [paper], [homepage].
- [2013 ICML] Maxout Networks, [paper], sources: [philipperemy/tensorflow-maxout].
- [2014 JMLR] Dropout: A Simple Way to Prevent Neural Networks from Overfitting, [paper] (see the dropout sketch after this list).
- [2015 ICCV] Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, [paper], [Kaiming He's homepage], sources: [nutszebra/prelu_net].
- [2015 ICML] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, [paper], sources: [IsaacChanghau/AmusingPythonCodes/batch_normalization], [tomokishii/mnist_cnn_bn.py] (see the batch norm sketch after this list).
- [2016 ICLR] Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), [paper].
- [2016 ArXiv] An overview of gradient descent optimization algorithms, [paper], [slides].
- [2016 ArXiv] Layer Normalization, [paper], sources: [ryankiros/layer-norm], [pbhatia243/tf-layer-norm], [NickShahML/tensorflow_with_latest_papers] (see the layer norm sketch after this list).
- [2016 ICLR] Incorporating Nesterov Momentum into Adam, [paper].
- [2016 ECCV] Deep Networks with Stochastic Depth (layer dropout), [paper], [poster], sources: [yueatsprograms/Stochastic_Depth], [samjabrahams/stochastic-depth-tensorflow].
- [2017 NIPS] Self-Normalizing Neural Networks, [paper], sources: [IsaacChanghau/AmusingPythonCodes/selu_activation_visualization], [shaohua0116/Activation-Visualization-Histogram], [bioinf-jku/SNNs], [IsaacChanghau/AmusingPythonCodes/snns] (see the SELU sketch after this list).
- [2017 ICLR] Recurrent Batch Normalization, [paper], sources: [cooijmanstim/recurrent-batch-normalization], [jihunchoi/recurrent-batch-normalization-pytorch].
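
A few of the techniques above are simple enough to capture in a few lines. The sketches below are minimal NumPy illustrations, not the reference implementations from the cited repositories; function names and defaults are illustrative choices. The dropout sketch uses the "inverted" formulation common in modern frameworks (rescale surviving units at training time by 1/(1 - rate)), which matches in expectation the test-time rescaling described in the JMLR paper:

```python
import numpy as np

def dropout(x, rate=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability `rate` during
    training and rescale survivors by 1/(1 - rate), so the expected
    activation matches inference, where this is a no-op."""
    if not training or rate == 0.0:
        return x
    rng = rng if rng is not None else np.random.default_rng()
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob  # True = unit is kept
    return x * mask / keep_prob

print(dropout(np.ones((2, 4)), rate=0.5, rng=np.random.default_rng(0)))
```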
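
The batch norm sketch covers only the training-time forward pass for a (batch, features) matrix; the full layer in the paper also learns gamma and beta by backprop and keeps running mean/variance estimates for inference:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Standardize each feature over the mini-batch, then apply the
    learned scale (gamma) and shift (beta). Forward pass only; the
    running statistics used at inference time are omitted here."""
    mu = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # standardized activations
    return gamma * x_hat + beta

x = np.random.default_rng(0).normal(5.0, 3.0, size=(8, 4))
y = batch_norm(x, np.ones(4), np.zeros(4))
print(y.mean(axis=0).round(6), y.std(axis=0).round(3))  # ~0 mean, ~1 std per feature
```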
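
The layer norm sketch differs from batch norm only in the axis the statistics are taken over: per example across features rather than per feature across the batch, so it is independent of batch size and behaves identically at train and test time:

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each example over its feature dimension, then apply
    the learned elementwise scale (gamma) and shift (beta)."""
    mu = x.mean(axis=-1, keepdims=True)   # per-example mean
    var = x.var(axis=-1, keepdims=True)   # per-example variance
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

x = np.random.default_rng(1).normal(size=(2, 6))
print(layer_norm(x, np.ones(6), np.zeros(6)).mean(axis=-1))  # ~0 per example
```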
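
The SELU sketch shows only the activation itself; the fixed constants alpha and lambda (here `scale`) come from the paper and give the self-normalizing property, which the paper additionally pairs with LeCun-normal initialization and alpha dropout:

```python
import numpy as np

def selu(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    """Scaled exponential linear unit: scale * x for x > 0, and
    scale * alpha * (exp(x) - 1) otherwise. With these constants,
    activations are driven toward zero mean and unit variance."""
    return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

print(selu(np.array([-2.0, -0.5, 0.0, 0.5, 2.0])))
```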