This project classifies images into three categories: Catch, Clap, and Hammering. The model is a fully connected neural network (NN) classifier trained with early stopping to prevent overfitting and improve generalization. The dataset is split into training and testing sets, with transformations applied to resize images to 64x64 pixels and convert them to tensors. The model is trained with a cross-entropy loss function and an SGD optimizer.
Requirements:

- Python 3.x
- PyTorch (with torchvision for datasets and transforms)
- splitfolders
- NumPy
- Matplotlib
The dataset consists of images from three categories:
- Catch
- Clap
- Hammering
These images are organized into folders for training and testing. The training data is used to train the neural network, while the testing data is used to evaluate the model's performance.
Key features:

- Image Preprocessing: Images are resized to 64x64 pixels and converted into tensor format for model input.
- Early Stopping: The EarlyStoppingCriterion class monitors validation loss, and the training process is stopped early if no improvement is observed over a specified number of epochs (patience).
- Model Architecture: The model consists of several fully connected layers with ReLU activation and batch normalization.
- Training and Testing: The training process includes calculating the training loss, training accuracy, and test accuracy. Results are plotted to visualize the model's performance.
The images are resized to 64x64 pixels, and the pixel values are normalized. The dataset is split into training and testing sets with an 80-20 ratio using the splitfolders library.
The model is a fully connected neural network with the following layers:
- Input Layer: Linear transformation to 1024 units
- Hidden Layers: 4 fully connected layers with ReLU activation and batch normalization
- Output Layer: A linear layer with 3 units (one for each class)
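One possible implementation of this architecture. The hidden-layer widths below are assumptions, since the description only fixes the first width (1024) and the output size (3):

```python
import torch
from torch import nn

class ActionClassifier(nn.Module):
    """Fully connected classifier matching the layer description above.
    Hidden widths (512, 256, 128, 64) are assumed, not from the project."""

    def __init__(self, in_features=64 * 64 * 3, num_classes=3):
        super().__init__()
        widths = [1024, 512, 256, 128, 64]  # input layer + 4 hidden layers
        layers = []
        prev = in_features
        for w in widths:
            # Each fully connected layer is followed by batch norm and ReLU.
            layers += [nn.Linear(prev, w), nn.BatchNorm1d(w), nn.ReLU()]
            prev = w
        layers.append(nn.Linear(prev, num_classes))  # 3 output units
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # Flatten the 3x64x64 image into a single feature vector.
        return self.net(x.flatten(start_dim=1))
```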
An early stopping mechanism is implemented to prevent overfitting. If the validation loss does not improve for a specified number of epochs (patience), the training stops early.
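The early-stopping logic can be sketched as below; the real `EarlyStoppingCriterion` class may differ in its exact interface:

```python
class EarlyStoppingCriterion:
    """Signals a stop when validation loss has not improved for
    `patience` consecutive epochs. A minimal sketch; the original
    class may track additional state (e.g. best model weights)."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.counter = 0
        self.should_stop = False

    def step(self, val_loss):
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: record the new best and reset the counter.
            self.best_loss = val_loss
            self.counter = 0
        else:
            self.counter += 1
            if self.counter >= self.patience:
                self.should_stop = True
        return self.should_stop
```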
The model is trained for a maximum of 10 epochs. The training loss and accuracy are printed at the end of each epoch, along with the test accuracy. The early stopping mechanism will stop the training once the validation loss stops improving.
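One way the training loop could look, assuming the test set doubles as the validation signal for early stopping and that the stopper object exposes a `step(val_loss)` method returning `True` when training should halt:

```python
import torch
from torch import nn

def evaluate(model, loader, criterion, device="cpu"):
    """Returns (accuracy, mean loss) over a data loader."""
    model.eval()
    correct, seen, total_loss = 0, 0, 0.0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            out = model(x)
            total_loss += criterion(out, y).item() * y.size(0)
            correct += (out.argmax(dim=1) == y).sum().item()
            seen += y.size(0)
    return correct / seen, total_loss / seen

def train(model, train_loader, test_loader, criterion, optimizer,
          stopper, max_epochs=10, device="cpu"):
    """Trains for up to max_epochs, printing metrics each epoch.
    Metric bookkeeping details are assumed, not from the project."""
    history = {"loss": [], "train_acc": [], "test_acc": []}
    for epoch in range(max_epochs):
        model.train()
        total_loss, correct, seen = 0.0, 0, 0
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            out = model(x)
            loss = criterion(out, y)
            loss.backward()
            optimizer.step()
            total_loss += loss.item() * y.size(0)
            correct += (out.argmax(dim=1) == y).sum().item()
            seen += y.size(0)
        epoch_loss = total_loss / seen
        test_acc, val_loss = evaluate(model, test_loader, criterion, device)
        history["loss"].append(epoch_loss)
        history["train_acc"].append(correct / seen)
        history["test_acc"].append(test_acc)
        print(f"epoch {epoch + 1}: train loss {epoch_loss:.4f}, "
              f"train acc {correct / seen:.4f}, test acc {test_acc:.4f}")
        # Here the test loss serves as the validation loss (an assumption).
        if stopper.step(val_loss):
            break
    return history
```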
The training and test accuracy, as well as the training loss, are plotted after the training process completes.
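A plotting sketch with Matplotlib, assuming the metrics were collected into per-epoch lists:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs headless
import matplotlib.pyplot as plt

def plot_history(history, out_path="training_curves.png"):
    """Plots training loss and train/test accuracy per epoch.
    `history` is assumed to map metric names to per-epoch lists."""
    epochs = range(1, len(history["loss"]) + 1)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(epochs, history["loss"], label="training loss")
    ax1.set_xlabel("epoch")
    ax1.set_ylabel("loss")
    ax1.legend()
    ax2.plot(epochs, history["train_acc"], label="train accuracy")
    ax2.plot(epochs, history["test_acc"], label="test accuracy")
    ax2.set_xlabel("epoch")
    ax2.set_ylabel("accuracy")
    ax2.legend()
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```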
Sample result: a plot of the training loss per epoch.
The model achieved a training accuracy of 96.83% and a test accuracy of 96.88%, with the test accuracy slightly exceeding the training accuracy. This can be attributed to the use of regularization techniques like early stopping, which helped prevent overfitting and allowed the model to generalize better to the test set. Overall, the model performs well, demonstrating robust generalization and effective image classification.
