The WindowGRU model used in this library and proposed by the paper can be very slow to train because the GRU layers use the relu activation function. With relu, the GRU layers cannot use the cuDNN kernel acceleration for RNNs/GRUs and fall back to generic GPU operations (see the Keras documentation).
For my project, the WindowGRU model takes more than an hour per epoch with a batch size of 32 on two months of UK-DALE data. I suggest the following FastWGRU implementation, which changes the activation function to tanh to benefit from cuDNN acceleration:
from nilmtk_contrib.disaggregate import WindowGRU
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.layers import Dense, Conv1D, GRU, Bidirectional, Dropout
from tensorflow.keras.models import Sequential


class FastWGRU(WindowGRU):

    def return_network(self):
        '''Creates the GRU architecture described in the paper,
        with tanh activations so the GRU layers can use the cuDNN kernel.
        '''
        model = Sequential()
        # 1D Conv
        model.add(Conv1D(16, 4, activation='relu',
                         input_shape=(self.sequence_length, 1),
                         padding='same', strides=1))
        # Bi-directional GRUs
        model.add(Bidirectional(GRU(64, activation='tanh',
                                    return_sequences=True),
                                merge_mode='concat'))
        model.add(Dropout(0.5))
        model.add(Bidirectional(GRU(128, activation='tanh',
                                    return_sequences=False),
                                merge_mode='concat'))
        model.add(Dropout(0.5))
        # Fully Connected Layers
        model.add(Dense(128, activation='relu'))
        model.add(Dropout(0.5))
        model.add(Dense(1, activation='linear'))
        model.compile(loss='mse', optimizer='adam')
        return model
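For reference, tanh is not the only requirement: the tf.keras.layers.GRU docs list several conditions that must all hold for the fused cuDNN kernel to be used. A minimal sketch summarising them (the helper function `uses_cudnn_gru` is mine, not part of Keras or nilmtk_contrib):

```python
def uses_cudnn_gru(activation="tanh", recurrent_activation="sigmoid",
                   recurrent_dropout=0.0, unroll=False, use_bias=True,
                   reset_after=True):
    """Return True if a Keras GRU layer built with these arguments is
    eligible for the fused cuDNN kernel, per the tf.keras.layers.GRU docs
    (inputs must also be unmasked or strictly right-padded)."""
    return (activation == "tanh"
            and recurrent_activation == "sigmoid"
            and recurrent_dropout == 0.0
            and not unroll
            and use_bias
            and reset_after)

# The original WindowGRU uses activation='relu', so it falls back to the
# generic (slow) implementation:
print(uses_cudnn_gru(activation="relu"))  # False
# FastWGRU uses the default tanh activation and is cuDNN-eligible:
print(uses_cudnn_gru(activation="tanh"))  # True
```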
This offers a substantial speed improvement, from about 1 hour per epoch down to roughly 3 minutes per epoch, with comparable results.
Let me know if this idea is interesting; I can open a PR for this implementation later on.