This project demonstrates how to classify banana ripeness (unripe, ripe, overripe, rotten) using a deep learning model built on top of MobileNetV2. By applying data augmentation, splitting data into training, validation, and testing sets, and fine-tuning hyperparameters, the system achieves high accuracy in predicting the ripeness level of bananas.
- Data Source: Kaggle dataset and custom-collected images covering four ripeness categories.
- Data Cleaning: Organizing images into separate folders by ripeness level.
- Data Transformation: Standardizing image format (`.JPG`) for uniform processing.
- MobileNetV2: Pre-trained on ImageNet, then fine-tuned for this specific 4-class classification.
- Data Augmentation: Rotation, zoom, shifting, shearing, and flipping for enhanced robustness.
- Hyperparameters: `rotation_range=100`, `epochs=13`, `optimizer='adam'`, and more.
- Evaluation: Achieved up to 97.29% accuracy.
- Validation Loss Monitoring: EarlyStopping to avoid overfitting.
- Identified Deficiencies: Misclassification issues and version discrepancies (TensorFlow & Keras).
- Adjustments: Switching data sources, refining image resolutions, ensuring environment consistency.
- Potential Real-time Web Deployment: Immediate ripeness detection with minimal human intervention.
- Further Extensions: Integration with IoT sensors (temperature/humidity) to predict optimal harvest windows.
- Python 3.x
- TensorFlow 2.x / Keras
- NumPy & Pandas for data handling
- Matplotlib & Seaborn for visualizations
- scikit-learn for data splitting and performance metrics
- GPU Support (optional but recommended for faster training)
- Thailand is currently facing a banana shortage and reduced banana production, so it is crucial to make the most of the limited supply. The "Banana Tester" was created to evaluate the ripeness level of bananas, because relying solely on human visual inspection of peel color and softness is highly uncertain: it depends on individual experience and environmental conditions. Moreover, fluctuating weather significantly affects the ripening rate, making it difficult to predict the ideal harvest time. Overripe or under-ripe bananas cannot be sold, resulting in economic losses, potential taste changes, and higher spoilage risks. Additionally, inconsistent ripeness levels among bananas increase the labor and time spent on sorting.
- Research References: Google Drive Link
- Strengths and Weaknesses of Previous Research:
From the studies reviewed, there are diverse methods of data collection, and the approaches for classifying banana ripeness are highly detailed. However, such methods can be time-consuming for data gathering. In contrast, our current experimental project involves faster data collection but still achieves comparable results.
- Identified Gap:
Existing research lacks a user-friendly program or website capable of real-time analysis of banana ripeness. Addressing this gap is the key motivation for our system's development.
- Connecting Previous Research with Our Approach:
Both the reviewed studies and our current project share a similar objective: reducing the uncertainty in human-based banana ripeness classification. Such misclassification often leads to waste from prematurely rotten bananas. Therefore, the main aim of these studies and our project is to increase the accuracy of banana ripeness classification by leveraging AI. Ultimately, this extends to creating a real-time web-based tool for easy and immediate use.
- Bananas with four different ripeness levels (unripe, ripe, overripe, rotten) were obtained from:
  - Kaggle Dataset
  - Google Drive Link (the drive link used for training)
- Images are categorized into folders according to their ripeness level.
- All images used in this project are standardized to `.JPG` format. Each ripeness category is placed in its corresponding folder.
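
A quick way to check the folder layout before training is to count the images per class and flag anything that is not a JPG. The sketch below uses only the standard library. The folder names (one sub-folder per class, named after the four ripeness levels) and the helper `summarize_dataset` are assumptions made for illustration, not the project's actual script.

```python
from pathlib import Path

# Class names come from the project description; the directory layout
# (root/<class name>/<image files>) is an assumption for this sketch.
CLASSES = ("unripe", "ripe", "overripe", "rotten")
JPG_SUFFIXES = {".jpg", ".jpeg"}


def summarize_dataset(root):
    """Count JPG images per class folder and list files in other formats."""
    root = Path(root)
    counts = {name: 0 for name in CLASSES}
    non_jpg = []
    for name in CLASSES:
        class_dir = root / name
        if not class_dir.is_dir():
            continue  # a missing class folder simply keeps a count of 0
        for path in sorted(class_dir.iterdir()):
            if not path.is_file():
                continue
            if path.suffix.lower() in JPG_SUFFIXES:
                counts[name] += 1
            else:
                non_jpg.append(path)
    return counts, non_jpg
```

Calling `summarize_dataset("dataset")` (a hypothetical path) returns the per-class counts plus the list of files that still need converting to `.JPG`.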
- Supervised Learning: Classification
We employ a classification approach to distinguish between the four banana ripeness levels.
- `rotation_range=100`: Randomly rotate images within ±100 degrees
- `zoom_range=0.05`: Randomly zoom images by up to 5%
- `width_shift_range=0.5`: Shift images horizontally by up to 50% of the width
- `height_shift_range=0.5`: Shift images vertically by up to 50% of the height
- `shear_range=0.15`: Apply shear transformations with an intensity of up to 0.15 (in Keras' `ImageDataGenerator`, this value is a shear angle in degrees)
- `horizontal_flip=True`: Enable random horizontal flipping
- `fill_mode="nearest"`: Fill in empty pixels using the nearest pixel value
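
The parameter names above match Keras' `ImageDataGenerator`, so the sketch below assumes that class is what the project uses. The `rescale` value and the helper name `make_generator` are assumptions; the source does not say how pixel values are scaled. TensorFlow is imported inside the function so the settings can be inspected without it installed.

```python
# Augmentation settings exactly as listed above.
AUGMENTATION = {
    "rotation_range": 100,
    "zoom_range": 0.05,
    "width_shift_range": 0.5,
    "height_shift_range": 0.5,
    "shear_range": 0.15,
    "horizontal_flip": True,
    "fill_mode": "nearest",
}


def make_generator(rescale=1.0 / 255, validation_split=0.2):
    """Build an ImageDataGenerator from the settings above.

    `rescale` is an assumption (a common choice, not stated in the source).
    """
    # Lazy import: TensorFlow is only needed when a generator is built.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    return ImageDataGenerator(
        rescale=rescale,
        validation_split=validation_split,
        **AUGMENTATION,
    )
```

Keeping the settings in one dictionary makes it easy to log them alongside each training run, which helps when comparing results across TensorFlow versions.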
- `validation_split=0.2`: Reserve 20% of the data for validation; 80% is used for training
- `train_size=0.7`: 70% of the dataset is used for training, and 30% for testing
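
To see how the two settings combine, the arithmetic can be sketched as follows. This assumes the 70/30 train/test split is applied first and `validation_split=0.2` is then taken from the training portion; the source does not state the order, and the helper `split_counts` is hypothetical.

```python
def split_counts(n_images, train_size=0.7, validation_split=0.2):
    """Return (train, validation, test) image counts.

    Assumption: the train/test split happens first, and the validation
    share is then taken out of the training portion.
    """
    n_train_pool = round(n_images * train_size)
    n_test = n_images - n_train_pool
    n_val = round(n_train_pool * validation_split)
    n_train = n_train_pool - n_val
    return n_train, n_val, n_test
```

Under that assumption, `split_counts(1000)` gives `(560, 140, 300)`: 56% of all images for training, 14% for validation, and 30% for testing.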
- MobileNetV2 Pre-trained Model
  - `input_shape=(224, 224, 3)`: Input images are 224×224 pixels with 3 color channels
  - `include_top=False`: The final classification layers of MobileNetV2 are excluded
- Dense Layers
  - 1 dense layer with 64 neurons, `activation='relu'`
  - 1 dense layer with 32 neurons, `activation='relu'`
  - Final dense layer with 4 neurons (`activation='softmax'`) for the 4 classes
- Dropout Layers
  - Dropout rate of 0.5 (50%) to help mitigate overfitting
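
The architecture described above could be assembled in Keras roughly as follows. Several details are assumptions because the source does not specify them: the global average pooling layer between the base and the dense layers, the position of the two dropout layers, and whether the base is frozen. The helper names `build_model` and `head_parameter_count` are hypothetical.

```python
NUM_CLASSES = 4
INPUT_SHAPE = (224, 224, 3)
HEAD_UNITS = (64, 32, NUM_CLASSES)  # dense layer widths from the list above


def build_model(weights="imagenet", freeze_base=True):
    """Sketch of MobileNetV2 plus the dense head described above."""
    # Lazy import so the constants can be used without TensorFlow installed.
    import tensorflow as tf

    base = tf.keras.applications.MobileNetV2(
        input_shape=INPUT_SHAPE, include_top=False, weights=weights
    )
    base.trainable = not freeze_base  # assumption: base frozen by default

    return tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),  # assumption: pooling type
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.5),              # assumption: placement
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dropout(0.5),              # assumption: placement
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])


def head_parameter_count(features=1280):
    """Trainable parameters in the dense head (weights plus biases).

    1280 is the channel count of MobileNetV2's last feature map, which is
    the head's input size when global average pooling is used.
    """
    total, fan_in = 0, features
    for units in HEAD_UNITS:
        total += fan_in * units + units
        fan_in = units
    return total
```

With these assumptions the head adds 84,196 trainable parameters, which is small next to the pre-trained base.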
- `optimizer='adam'`: Use the Adam optimizer to adjust model weights
- `loss='binary_crossentropy'`: Use binary cross-entropy as the loss function
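
With a 4-way softmax output, `binary_crossentropy` in Keras averages a per-output binary loss over the four outputs, whereas `categorical_crossentropy` (the more common pairing with softmax) only scores the probability assigned to the true class. The plain-Python sketch below illustrates the difference on one example. These are simplified re-implementations for illustration, not Keras' own code, and the example probabilities are made up.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Negative log-probability of the true class (one-hot y_true)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of the per-output binary cross-entropy terms."""
    terms = []
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        terms.append(-(t * math.log(p) + (1 - t) * math.log(1 - p)))
    return sum(terms) / len(terms)


# Hypothetical example: the true class is the second one.
y_true = [0, 1, 0, 0]
y_pred = [0.05, 0.85, 0.07, 0.03]
cce = categorical_crossentropy(y_true, y_pred)
bce = binary_crossentropy(y_true, y_pred)
```

For this example the binary variant reports a smaller loss than the categorical one, so loss values from the two settings are not directly comparable.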
- `batch_size=32`: Number of examples processed in each weight update step
- `epochs=13`: Maximum number of training epochs (training can stop earlier through EarlyStopping)
- EarlyStopping
  - `monitor='val_loss'`: Monitor validation loss
  - `patience=3`: Stop training if validation loss does not improve for 3 consecutive epochs
  - `restore_best_weights=True`: Revert to the best weights obtained during training
- `shuffle=False`: Disable shuffling of training samples (in certain generator setups)
- `seed=0`: Fix the random seed for reproducible results
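
The interaction between `epochs=13`, `patience=3`, and `restore_best_weights=True` can be illustrated without TensorFlow. The sketch below walks through a list of validation losses and reports when training would stop and which epoch's weights would be kept. It is a simplified model of the patience rule, and `simulate_early_stopping` is a hypothetical helper, not part of Keras.

```python
def simulate_early_stopping(val_losses, patience=3, max_epochs=13):
    """Return (last_epoch_run, best_epoch), both counted from 1.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs, or when `max_epochs` is reached.
    `best_epoch` is the epoch whose weights would be restored.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    last_epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        last_epoch = epoch
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                break
    return last_epoch, best_epoch
```

For example, `simulate_early_stopping([0.90, 0.60, 0.50, 0.55, 0.52, 0.51, 0.40])` returns `(6, 3)`: the loss last improved at epoch 3, three epochs pass without improvement, and training stops after epoch 6 before the later value of 0.40 is ever reached.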
- In this project, we use MobileNetV2 (pre-trained on ImageNet) and fine-tune it for banana ripeness classification.
- Accuracy: 97.29%
- After testing, our model achieved an accuracy of 97.29%. This score is the proportion of correct predictions out of all predictions made, indicating high model performance.
- Some instances were misclassified, with the model predicting the opposite class from the true one. This issue appeared during the initial training phase. Version discrepancies (e.g., TensorFlow 2.12.0 vs. 2.17.0, and Keras 3.6.0) also contributed to training complications.
- Increasing the image size (resolution) of uploaded images can improve clarity and potentially reduce misclassification.
- For the initial training, we used our own dataset, which introduced some uncertainty and lowered the model's performance. We then switched to data from Kaggle, which improved the results significantly, up to 97.29% accuracy.
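
The accuracy figure reported above is the share of test images whose predicted class matches the true class. A minimal sketch of that calculation, together with turning a softmax output into a class name, is shown below. The class order in `CLASSES` is an assumption (the order used in training depends on how the data generator indexes the folders), and both helper names are hypothetical.

```python
# Assumed class order; the real index-to-name mapping comes from training.
CLASSES = ("unripe", "ripe", "overripe", "rotten")


def predicted_label(probabilities):
    """Return the class name with the highest softmax probability."""
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CLASSES[best_index]


def accuracy(y_true, y_pred):
    """Proportion of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

For instance, `accuracy(["ripe", "rotten", "unripe", "ripe"], ["ripe", "rotten", "ripe", "ripe"])` is `0.75`, since three of the four predictions are correct.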
This document consolidates the essential information regarding the Banana Tester project. All references and data links are retained, and the text has been translated into English to provide broader accessibility.

