Effective communication is essential for human interaction, yet individuals with hearing impairments and speaking difficulties often face significant challenges. The ability to recognize and translate sign language in real time can bridge the communication gap between those who do not know sign language and those who rely on it.
This study explores various sign language conventions prevalent in Nigeria and beyond, including different phonetic and semantic structures, and the messages they convey. It covers languages such as American Sign Language (ASL), Bura Sign Language (BSL), Yoruba Sign Language (YSL), Hausa Sign Language, and Adamorobe Sign Language (Ghana).
To address the challenge of real-time sign language recognition and translation, this study utilized an Object-Oriented Programming (OOP) methodology. The system implements an ensemble learning-based approach combining Bi-Directional Convolutional Neural Networks (Bi-CNN) and Bidirectional Encoder Representations from Transformers (BERT) for sign language-to-text translation.
The Bi-CNN architecture processes video sequences by analyzing temporal patterns in both forward and backward directions, capturing the full context of hand movements and gestures. Extracted visual features are then fed into BERT, which applies contextual language understanding to generate coherent English text output.
This dual-encoder mechanism—combining spatial-temporal feature extraction with transformer-based language modeling—ensures both high performance and robustness in translating sign language gestures into written language.
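The bidirectional idea can be illustrated with a toy NumPy sketch. This is not the actual Bi-CNN: a running mean stands in for the learned temporal convolution, purely to show how forward and backward passes give each time step both past and future context:

```python
import numpy as np

def causal_mean(seq):
    """Running mean up to each time step: a tiny stand-in for a causal
    temporal convolution over per-frame visual features."""
    return np.cumsum(seq, axis=0) / np.arange(1, len(seq) + 1)[:, None]

def bidirectional_features(seq):
    """Process the sequence forward and backward, then concatenate,
    so every time step sees both past and future context (the Bi-CNN idea)."""
    fwd = causal_mean(seq)
    bwd = causal_mean(seq[::-1])[::-1]
    return np.concatenate([fwd, bwd], axis=-1)

# Toy demo: 4 time steps of 3-dimensional frame features
frames = np.arange(12, dtype=float).reshape(4, 3)
feats = bidirectional_features(frames)
print(feats.shape)  # (4, 6)
```

In the real system the concatenated features would then be passed to BERT for language modeling; here the shapes alone show that each time step carries context from both directions.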
Community-based data gathering, validation, refinement, and analysis techniques were employed to create a reliable and diverse dataset. The proposed model achieved 98.7% accuracy, 97.6% precision, and an F1-Score of 98.2%.
While the study examined multiple sign language systems for comparative analysis, the implementation focused specifically on ASL datasets (both images and videos), with validation through community feedback from language experts.
The system demonstrated high reliability and cost-effectiveness, especially in real-time applications, making it suitable for deployment on mobile devices, where the built-in camera supplies input for hand gesture recognition (HGR).
```
project/
│
├── kaggle_dataset/
├── manual_data_collection/
├── esemble_method/
├── realtime_recognition/
├── classification_report/
│   ├── classification_report.png
│   ├── confusion_matrix.png
│   └── confusion_matrix_normalized.png
│
├── training_history.png
├── app.py              # Main file
└── README.md
```
```bash
git clone https://github.com/chibuezedev/Signa.git
cd Signa
```

Make sure you have Python 3.9+ and pip installed, then:

```bash
pip install -r requirements.txt
python app.py
```

You'll get an interactive menu in your terminal:
- Download/Setup Kaggle Dataset
- Collect Manual Data (Single Letter)
- Collect Manual Data (Full Alphabet)
- Train Model
- Run Real-Time Recognition
- Exit
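Internally, such a menu is typically a mapping from option numbers to handlers. A hypothetical sketch follows; the option keys and handler bodies are illustrative assumptions, not the real `app.py` code:

```python
# Hypothetical menu dispatch; handlers are stubs standing in for the
# real workflow functions (Kaggle setup, data collection, training, etc.)
def dispatch(choice, actions):
    """Run the handler registered for a menu choice and return its result."""
    label, handler = actions[choice]
    return handler()

ACTIONS = {
    "1": ("Download/Setup Kaggle Dataset", lambda: "kaggle_setup"),
    "2": ("Collect Manual Data (Single Letter)", lambda: "collect_single"),
    "3": ("Collect Manual Data (Full Alphabet)", lambda: "collect_alphabet"),
    "4": ("Train Model", lambda: "train"),
    "5": ("Run Real-Time Recognition", lambda: "recognize"),
    "6": ("Exit", lambda: "exit"),
}

print(dispatch("4", ACTIONS))  # train
```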
```python
from asl_workflow import quick_collect_single_letter, quick_train_kaggle_only, quick_run_recognition

# Collect 100 samples for letter 'A'
quick_collect_single_letter('A', 100)

# Train on Kaggle data only
quick_train_kaggle_only(50)

# Run recognition with trained model
quick_run_recognition('best_asl_model.h5')
```

The system integrates:
- Bi-Directional CNN (Bi-CNN) → for spatial-temporal hand gesture analysis
- BERT Transformer → for contextual English text generation
- Ensemble Learning → combines predictions from both architectures
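One common way to combine predictions from two architectures is soft voting over their class-probability outputs. A minimal sketch, with the caveat that the equal weighting is an assumption; the project's actual combination scheme is not specified here:

```python
import numpy as np

def ensemble_predict(p_bicnn, p_bert, w=0.5):
    """Soft-voting ensemble: blend the class-probability outputs of the
    Bi-CNN and BERT branches, then take the most likely class.
    The weight w is an assumption, not the project's documented value."""
    combined = w * np.asarray(p_bicnn) + (1 - w) * np.asarray(p_bert)
    return combined.argmax(axis=-1)

# Toy probabilities for 2 samples over 3 classes
p_cnn = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
p_bert = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
print(ensemble_predict(p_cnn, p_bert))  # [0 2]
```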
Future work could explore:
- Expanding the system's language coverage by incorporating more regional sign languages
- Developing more sophisticated models for continuous signing and complex sentence structures
- Implementing text-to-sign language generation for two-way communication
- Accuracy: 98.7%
- Precision: 97.6%
- F1-Score: 98.2%
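These metrics follow their standard definitions. For reference, a minimal sketch computing them from binary confusion-matrix counts (the counts below are illustrative; the figures above come from the project's actual evaluation):

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, precision, and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, f1

# Illustrative counts, not the project's evaluation data
acc, prec, f1 = metrics(tp=90, fp=5, fn=5, tn=100)
print(round(acc, 3), round(prec, 3), round(f1, 3))  # 0.95 0.947 0.947
```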
If you use this work for your research, please cite:
```bibtex
@misc{chibueze2025sign,
  title={Development of an Ensemble-Based Bi-Directional CNN and BERT System for Real-Time Sign Language Recognition},
  author={Chibueze, Paul},
  year={2025}
}
```

Special thanks to the deaf community members and sign language experts who contributed to data validation and system refinement.