Hi,
cool framework!
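Note that the VGG class adds an AvgPool2D layer with kernel=1.
With a kernel size of 1, each output is the average of a single element, so the layer has no effect. Perhaps you meant an adaptive average pooling layer (e.g. AdaptiveAvgPool2d in PyTorch)?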
In addition, the input to the classification layer is usually 7x7x512 = 25088 features, given a 224x224 input.
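
To make both points concrete, here is a small pure-Python sketch (no framework needed). It assumes the standard VGG layout, where the 3x3 convolutions preserve the spatial size and there are five 2x2 max-pools with stride 2; your VGG class may differ. The helper names (`pool_output_size`, `avg_pool2d`, `vgg_flat_features`) are made up for illustration and are not from your code.

```python
def pool_output_size(size, kernel, stride, padding=0):
    """Output length of a pooling layer along one spatial dimension (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def avg_pool2d(grid, kernel, stride):
    """Naive 2D average pooling over a list of lists, without padding."""
    out_h = pool_output_size(len(grid), kernel, stride)
    out_w = pool_output_size(len(grid[0]), kernel, stride)
    return [
        [
            sum(
                grid[i * stride + di][j * stride + dj]
                for di in range(kernel)
                for dj in range(kernel)
            ) / (kernel * kernel)
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def vgg_flat_features(input_size=224, num_pools=5, channels=512):
    """Spatial size and flattened feature count after the conv stack.

    Assumption: convolutions keep the spatial size, and each of the
    num_pools max-pool layers uses kernel=2, stride=2.
    """
    size = input_size
    for _ in range(num_pools):
        size = pool_output_size(size, kernel=2, stride=2)
    return size, size * size * channels


# Average pooling with kernel=1 and stride=1 returns the input unchanged.
grid = [[1.0, 2.0], [3.0, 4.0]]
identity_result = avg_pool2d(grid, kernel=1, stride=1)
print(identity_result == grid)  # True

# 224 -> 112 -> 56 -> 28 -> 14 -> 7, so the classifier sees 7 * 7 * 512 inputs.
spatial, flat = vgg_flat_features()
print(spatial, flat)  # 7 25088
```

An adaptive average pool with a 7x7 target would keep the classifier input at 25088 for other input resolutions as well, which is presumably what the layer was intended to do.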