This project demonstrates how to classify banana ripeness (unripe, ripe, overripe, rotten) using a deep learning model built on top of MobileNetV2. By applying data augmentation, splitting data into training, validation, and testing sets, and fine-tuning hyperparameters, the system achieves high accuracy in predicting the ripeness level of bananas.
- 🌐 Data Source: Kaggle dataset and custom-collected images covering four ripeness categories.
- 🧹 Data Cleaning: Organizing images into separate folders by ripeness level.
- 🔄 Data Transformation: Standardizing image format (`.JPG`) for uniform processing.
- 🤖 MobileNetV2 Pre-trained on ImageNet, then fine-tuned for this specific 4-class classification.
- 🤸 Data Augmentation: Rotation, zoom, shifting, shearing, flipping for enhanced robustness.
- 📈 Hyperparameters: `rotation_range=100`, `epochs=13`, `optimizer='adam'`, and more.
- 🎯 Evaluation: Achieved up to 97.29% accuracy.
- ⏱️ Validation Loss Monitoring with EarlyStopping to avoid overfitting.
- 🐞 Identified Deficiencies: Misclassification issues and version discrepancies (TensorFlow & Keras).
- 🔧 Adjustments: Switching data sources, refining image resolutions, ensuring environment consistency.
- ⚡ Potential Real-time Web Deployment for immediate ripeness detection and minimal human intervention.
- 🌐 Further Extensions: Integration with IoT sensors (temperature/humidity) to predict optimal harvest windows.
- 🐍 Python 3.x
- 🤖 TensorFlow 2.x / Keras
- 🏗 NumPy & Pandas for data handling
- 📊 Matplotlib & Seaborn for visualizations
- 🧪 Sklearn for data splitting and performance metrics
- 🚀 GPU Support (optional but recommended for faster training)
- Thailand is currently experiencing a banana 🍌 shortage and reduced banana production, so it is crucial to make the most of the limited supply. The “Banana Tester” 🍌 was created to evaluate banana ripeness, because human visual inspection of peel color and softness is unreliable: it depends on individual experience and environmental conditions. Fluctuating weather also significantly affects the ripening rate, making the ideal harvest time hard to predict. Overripe or underripe bananas cannot be sold, which causes economic losses, taste changes, and higher spoilage risk. Inconsistent ripeness within a batch also increases the labor and time spent on sorting.
- Research References: Google Drive Link
- Strengths and Weaknesses of Previous Research: From the studies reviewed, there are diverse methods of data collection, and the approaches for classifying banana ripeness are highly detailed. However, such methods can be time-consuming for data gathering. In contrast, our current experimental project involves faster data collection but still achieves comparable results.
- Identified Gap: Existing research lacks a user-friendly program or website capable of real-time 🍌 analysis of banana ripeness. Addressing this gap is the key motivation for our system’s development.
- Connecting Previous Research with Our Approach: Both the reviewed studies and our current project share a similar objective: solving the uncertainty in human-based banana 🍌 ripeness classification. Such misclassification often leads to waste from prematurely rotten bananas. Therefore, the main aim of these studies and our project is to increase the accuracy of banana ripeness classification by leveraging AI 🤖. Ultimately, this extends to creating a real-time web-based tool for easy and immediate use.
- Bananas with four different ripeness levels (unripe, ripe, overripe, rotten) were obtained from:
  - Kaggle Dataset 🍌
  - Google Drive Link 🍌 (the drive link used for training)
- Images are categorized into folders according to their ripeness level.
- All images used in this project are standardized to `.JPG` format. Each ripeness category is placed in its corresponding folder.
- Supervised Learning: Classification
We employ a classification approach to distinguish between the four banana ripeness levels.
- `rotation_range=100` ↪ Randomly rotate images within ±100 degrees
- `zoom_range=0.05` ↪ Randomly zoom images up to 5%
- `width_shift_range=0.5` ↪ Shift images horizontally by up to 50% of the width
- `height_shift_range=0.5` ↪ Shift images vertically by up to 50% of the height
- `shear_range=0.15` ↪ Apply shear transformations up to 15%
- `horizontal_flip=True` ↪ Enable horizontal flipping
- `fill_mode="nearest"` ↪ Fill in empty pixels using the nearest value
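
As a minimal sketch, the augmentation settings above could be passed to Keras' `ImageDataGenerator` as shown below. The `AUGMENTATION` dictionary and the `make_train_datagen` helper are hypothetical names introduced here for illustration; the source lists only the parameter values, not the surrounding code.

```python
# Augmentation settings copied from the list above.
AUGMENTATION = {
    "rotation_range": 100,
    "zoom_range": 0.05,
    "width_shift_range": 0.5,
    "height_shift_range": 0.5,
    "shear_range": 0.15,
    "horizontal_flip": True,
    "fill_mode": "nearest",
}


def make_train_datagen():
    """Build an ImageDataGenerator from AUGMENTATION (hypothetical helper)."""
    # Imported lazily so the settings can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    return ImageDataGenerator(**AUGMENTATION)
```

One detail worth checking against the Keras documentation: `shear_range` is described there as a shear angle in degrees, so `0.15` may produce a much milder shear than the "15%" wording suggests.
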
- `validation_split=0.2` ↪ Reserve 20% of the data for validation; 80% is used for training
- `train_size=0.7` ↪ 70% of the dataset is used for training, and 30% for testing
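
The two split settings interact, so a short calculation may help. The sketch below assumes the test set is held out first (`train_size=0.7`) and that `validation_split=0.2` is then applied to the remaining training portion; the source does not state the order, and `split_counts` is a hypothetical helper. Real splitters may round slightly differently, so treat the counts as approximate.

```python
def split_counts(n_images, train_size=0.7, validation_split=0.2):
    """Return approximate (train, validation, test) image counts.

    Assumption: the test set is held out first, then validation_split
    is taken from the remaining training portion.
    """
    n_train_total = round(n_images * train_size)
    n_test = n_images - n_train_total
    n_val = round(n_train_total * validation_split)
    n_train = n_train_total - n_val
    return n_train, n_val, n_test


# Hypothetical dataset size, for illustration only.
print(split_counts(1000))  # → (560, 140, 300)
```

Under that assumption, roughly 56% of all images are used for weight updates, 14% for validation, and 30% for the final test.
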
- MobileNetV2 Pre-trained Model
  - `input_shape=(224, 224, 3)` ↪ Input images have a resolution of 224×224×3
  - `include_top=False` ↪ The final classification layers of MobileNetV2 are excluded
- Dense Layers
  - 1 dense layer with 64 neurons, `activation='relu'`
  - 1 dense layer with 32 neurons, `activation='relu'`
  - Final dense layer with 4 neurons (`activation='softmax'`) for the 4 classes
- Dropout Layers
  - Dropout rate of 0.5 (50%) to help mitigate overfitting
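
The architecture described above could be assembled roughly as follows. This is a sketch under stated assumptions, not the project's actual code: the source gives the layer sizes, activations, and dropout rate, but not where dropout is placed, how the feature map is flattened, or whether the backbone is frozen. Those choices, and the `build_model` helper itself, are assumptions and are marked as such in the comments.

```python
# Head layers as listed above: (units, activation).
HEAD_LAYERS = [(64, "relu"), (32, "relu"), (4, "softmax")]
DROPOUT_RATE = 0.5
INPUT_SHAPE = (224, 224, 3)


def build_model(weights="imagenet"):
    """Assemble MobileNetV2 plus the dense head (hypothetical helper)."""
    # Imported lazily so the constants can be read without TensorFlow.
    from tensorflow import keras

    base = keras.applications.MobileNetV2(
        input_shape=INPUT_SHAPE, include_top=False, weights=weights
    )
    # Assumption: the backbone is frozen while the new head is trained.
    base.trainable = False

    # Assumption: global average pooling flattens the feature map.
    layers = [base, keras.layers.GlobalAveragePooling2D()]
    for units, activation in HEAD_LAYERS[:-1]:
        layers.append(keras.layers.Dense(units, activation=activation))
        # Assumption: dropout follows each hidden dense layer.
        layers.append(keras.layers.Dropout(DROPOUT_RATE))
    units, activation = HEAD_LAYERS[-1]
    layers.append(keras.layers.Dense(units, activation=activation))
    return keras.Sequential(layers)
```

Calling `build_model().summary()` prints the resulting layer stack; passing `weights=None` skips the ImageNet download for a quick structural check.
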
- `optimizer='adam'` ↪ Use Adam optimizer to adjust model weights
- `loss='binary_crossentropy'` ↪ Use binary cross-entropy as the loss function
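
A minimal sketch of the compile step with these settings is shown below. `COMPILE_KWARGS` and `compile_model` are hypothetical names, and the `metrics` entry is an assumption since the source does not list metrics. The optional `loss` override is included because `categorical_crossentropy` is the more common pairing for a 4-way softmax with one-hot labels, which makes it easy to compare the two.

```python
COMPILE_KWARGS = {
    "optimizer": "adam",
    "loss": "binary_crossentropy",  # as stated in the list above
    "metrics": ["accuracy"],        # assumption: not listed in the source
}


def compile_model(model, loss=None):
    """Compile a Keras model in place; `loss` optionally overrides the default."""
    kwargs = dict(COMPILE_KWARGS)
    if loss is not None:
        kwargs["loss"] = loss
    model.compile(**kwargs)
    return model
```

For example, `compile_model(model, loss="categorical_crossentropy")` keeps the optimizer and metrics but swaps the loss.
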
- `batch_size=32` ↪ Number of examples processed in each weight update step
- `epochs=13` ↪ Total number of training epochs
- EarlyStopping
  - `monitor='val_loss'` ↪ Monitor validation loss
  - `patience=3` ↪ Stop training if validation loss does not improve after 3 consecutive epochs
  - `restore_best_weights=True` ↪ Revert to the best weights obtained during training
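
To make the stopping rule concrete, the following pure-Python sketch mimics what the EarlyStopping settings above do with a sequence of validation losses. It is an illustration of the logic, not the Keras implementation; `early_stopping_epoch` is a hypothetical helper and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Mirrors monitor='val_loss': training stops once the loss has failed
    to improve for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# Hypothetical validation losses, for illustration only.
losses = [0.90, 0.60, 0.45, 0.47, 0.46, 0.48, 0.44]
print(early_stopping_epoch(losses))  # → (6, 3)

# The corresponding Keras callback would be:
# tf.keras.callbacks.EarlyStopping(
#     monitor="val_loss", patience=3, restore_best_weights=True)
```

In this example training halts at epoch 6, and with `restore_best_weights=True` the model would be rolled back to the weights from epoch 3. With `epochs=13`, training never runs past the 13th epoch even if the loss keeps improving.
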
- `shuffle=False` ↪ Disable shuffling of training samples (in certain generator setups)
- `seed=0` ↪ Fix the random seed for reproducible results
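
One plausible place for these two settings is the call that streams images from the class folders. The sketch below assumes Keras' `flow_from_directory` is used; the `make_eval_flow` helper, the `directory` argument, and `class_mode="categorical"` are assumptions, while the target size and batch size come from the settings listed earlier.

```python
def make_eval_flow(datagen, directory):
    """Stream images in a fixed, reproducible order (hypothetical helper).

    `datagen` is expected to be a Keras ImageDataGenerator and
    `directory` a folder with one subfolder per ripeness class.
    """
    return datagen.flow_from_directory(
        directory,
        target_size=(224, 224),    # matches input_shape=(224, 224, 3)
        batch_size=32,
        class_mode="categorical",  # assumption: one-hot labels
        shuffle=False,             # keep file order fixed
        seed=0,                    # fix any remaining randomness
    )
```

Keeping `shuffle=False` on an evaluation generator means the predictions come back in the same order as the generator's `classes` attribute, which makes it straightforward to compare predicted and true labels.
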
- In this project, we use MobileNetV2 (pre-trained on ImageNet) and fine-tune it for banana ripeness classification. 🍌
- Accuracy: 97.29% 🎉
- After testing, our model achieved an accuracy of 97.29%. This score represents the proportion of correctly predicted outcomes compared to the total number of predictions, indicating high model performance. 🚀
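
The accuracy definition above amounts to a one-line calculation. The sketch below uses a handful of made-up labels purely for illustration; it does not reproduce the project's 97.29% result.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels, for illustration only.
y_true = ["unripe", "ripe", "overripe", "rotten", "ripe"]
y_pred = ["unripe", "ripe", "ripe", "rotten", "ripe"]
print(f"{accuracy(y_true, y_pred):.2%}")  # → 80.00%
```
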
- Some images were misclassified, with the model predicting a class other than the true one. 😅 This issue appeared during the initial training phase. Version discrepancies (e.g., TensorFlow 2.12.0 vs. 2.17.0, and Keras 3.6.0) also complicated training.
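
When investigating misclassifications like these, it can help to count which true class is confused with which predicted class. The following is a small sketch with hypothetical helper names and made-up labels; it is not taken from the project's code.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count every (true, predicted) label pair."""
    return Counter(zip(y_true, y_pred))


def misclassified(y_true, y_pred):
    """Keep only the pairs where the prediction differs from the truth."""
    counts = confusion_counts(y_true, y_pred)
    return {pair: n for pair, n in counts.items() if pair[0] != pair[1]}


# Toy labels, for illustration only.
y_true = ["unripe", "ripe", "overripe", "rotten", "overripe"]
y_pred = ["unripe", "ripe", "ripe", "rotten", "ripe"]
print(misclassified(y_true, y_pred))  # → {('overripe', 'ripe'): 2}
```

A pair that shows up often, such as overripe bananas predicted as ripe in this toy example, points to the classes that may need more or cleaner training images.
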
- Increasing the image size (resolution) 📷 for uploads can improve clarity and potentially reduce misclassification.
- For the initial training, we used our own dataset, which introduced some uncertainty and lowered the model’s performance. We then switched to data from Kaggle, resulting in a significant improvement—up to 97.29% accuracy. 🤩
This document consolidates the essential information regarding the Banana Tester 🍌 project. All references and data links are retained, and the text has been translated into English to provide broader accessibility.

