CNN training on MNIST does not converge #145

Closed
@milancurcic

Description

Two examples behave differently:
  • The cnn_mnist example, which trains a CNN on MNIST data, stays at chance-level (10%) accuracy over epochs.
  • The cnn_from_keras example, which loads a pre-trained CNN from Keras, achieves the expected high accuracy (90.14%).

Since cnn_from_keras reaches the expected accuracy with pre-trained weights, the above suggests that the forward passes of the conv2d, maxpool2d, and flatten layers are implemented correctly.

The culprit may be in the implementation of backward methods for any of these layers, or in the backward flow of data.
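One common way to narrow down a suspected backward-pass bug is a finite-difference gradient check: compare the analytic gradient from a layer's backward method against a numerical estimate from its forward pass. The sketch below illustrates the idea in plain Python on a single-channel "valid" 2-D cross-correlation. It is not the library's code. All function names here (conv2d_forward, conv2d_backward, numerical_grad, and so on) are hypothetical and chosen only for illustration.

```python
import random

def conv2d_forward(x, k):
    """Valid cross-correlation of a 2-D input x with a 2-D kernel k."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)] for i in range(oh)]

def conv2d_backward(x, k, dy):
    """Analytic gradients of L = sum(dy * y) with respect to x and k."""
    kh, kw = len(k), len(k[0])
    dx = [[0.0] * len(x[0]) for _ in x]
    dk = [[0.0] * kw for _ in range(kh)]
    for i in range(len(dy)):
        for j in range(len(dy[0])):
            for a in range(kh):
                for b in range(kw):
                    dx[i + a][j + b] += dy[i][j] * k[a][b]
                    dk[a][b] += dy[i][j] * x[i + a][j + b]
    return dx, dk

def loss(x, k, dy):
    """Scalar test loss: the output weighted by a fixed upstream gradient dy."""
    y = conv2d_forward(x, k)
    return sum(dy[i][j] * y[i][j]
               for i in range(len(y)) for j in range(len(y[0])))

def numerical_grad(f, m, eps=1e-6):
    """Central-difference gradient of scalar f() for each entry of matrix m.

    Each entry of m is perturbed in place and restored afterwards.
    """
    g = [[0.0] * len(m[0]) for _ in m]
    for i in range(len(m)):
        for j in range(len(m[0])):
            old = m[i][j]
            m[i][j] = old + eps
            f_plus = f()
            m[i][j] = old - eps
            f_minus = f()
            m[i][j] = old
            g[i][j] = (f_plus - f_minus) / (2 * eps)
    return g

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a[0])))

random.seed(0)
x = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(5)]
k = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
dy = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]

dx, dk = conv2d_backward(x, k, dy)
dx_num = numerical_grad(lambda: loss(x, k, dy), x)
dk_num = numerical_grad(lambda: loss(x, k, dy), k)

err_dx = max_abs_diff(dx, dx_num)
err_dk = max_abs_diff(dk, dk_num)
print("max |dx - dx_num| =", err_dx)
print("max |dk - dk_num| =", err_dk)
```

If a backward method is correct, both differences should be tiny compared with the gradient values. A large mismatch for one layer type would point at that layer's backward method. If every layer passes in isolation, the data flow between layers becomes the more likely suspect.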

This should be fixed before the release of v0.13.0.

Labels

bug: Something isn't working
