Effective communication is essential for human interaction, yet individuals with hearing impairments and speaking difficulties often face significant challenges. The ability to recognize and translate sign language in real time can bridge the communication gap between those who do not know sign language and those who rely on it.
This study explores various sign language conventions prevalent in Nigeria and beyond, including different phonetic and semantic structures, and the messages they convey. It covers languages such as American Sign Language (ASL), Bura Sign Language (BSL), Yoruba Sign Language (YSL), Hausa Sign Language, and Adamorobe Sign Language (Ghana).
To address the challenge of real-time sign language recognition and translation, this study utilized an Object-Oriented Programming (OOP) methodology. The system implements an ensemble learning-based approach combining Bi-Directional Convolutional Neural Networks (Bi-CNN) and Bidirectional Encoder Representations from Transformers (BERT) for sign language-to-text translation.
The Bi-CNN architecture processes video sequences by analyzing temporal patterns in both forward and backward directions, capturing the full context of hand movements and gestures. Extracted visual features are then fed into BERT, which applies contextual language understanding to generate coherent English text output.
This dual-encoder mechanism—combining spatial-temporal feature extraction with transformer-based language modeling—ensures both high performance and robustness in translating sign language gestures into written language.
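The bidirectional idea can be illustrated with a toy NumPy sketch. This is not the actual Bi-CNN: a running mean stands in for the learned temporal convolution, purely to show how forward and backward passes give each time step both past and future context:

```python
import numpy as np

def causal_mean(seq):
    """Running mean up to each time step: a tiny stand-in for a causal
    temporal convolution over per-frame visual features."""
    return np.cumsum(seq, axis=0) / np.arange(1, len(seq) + 1)[:, None]

def bidirectional_features(seq):
    """Process the sequence forward and backward, then concatenate,
    so every time step sees both past and future context (the Bi-CNN idea)."""
    fwd = causal_mean(seq)
    bwd = causal_mean(seq[::-1])[::-1]
    return np.concatenate([fwd, bwd], axis=-1)

# Toy demo: 4 time steps of 3-dimensional frame features
frames = np.arange(12, dtype=float).reshape(4, 3)
feats = bidirectional_features(frames)
print(feats.shape)  # (4, 6)
```

In the real system the concatenated features would then be passed to BERT for language modeling; here the shapes alone show that each time step carries context from both directions.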
Community-based data gathering, validation, refinement, and analysis techniques were employed to create a reliable and diverse dataset. The proposed model achieved 98.7% accuracy, 97.6% precision, and an F1-Score of 98.2%.
While the study examined multiple sign language systems for comparative analysis, the implementation focused specifically on ASL datasets (both images and videos), with validation through community feedback from language experts.
The system demonstrated high reliability and cost-effectiveness, especially in real-time applications, making it suitable for deployment on mobile devices, where the built-in camera supplies input for hand gesture recognition (HGR).
```
project/
│
├── kaggle_dataset/
├── manual_data_collection/
├── esemble_method/
├── realtime_recognition/
├── classification_report/
│   ├── classification_report.png
│   ├── confusion_matrix.png
│   └── confusion_matrix_normalized.png
│
├── training_history.png
├── app.py              # Main file
└── README.md
```
```bash
git clone https://github.com/chibuezedev/Signa.git
cd Signa
```

Make sure you have Python 3.9+ and pip installed, then:

```bash
pip install -r requirements.txt
python app.py
```

You'll get an interactive menu in your terminal:
- Download/Setup Kaggle Dataset
- Collect Manual Data (Single Letter)
- Collect Manual Data (Full Alphabet)
- Train Model
- Run Real-Time Recognition
- Exit
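Internally, such a menu is typically a mapping from option numbers to handlers. A hypothetical sketch follows; the option keys and handler bodies are illustrative assumptions, not the real `app.py` code:

```python
# Hypothetical menu dispatch; handlers are stubs standing in for the
# real workflow functions (Kaggle setup, data collection, training, etc.)
def dispatch(choice, actions):
    """Run the handler registered for a menu choice and return its result."""
    label, handler = actions[choice]
    return handler()

ACTIONS = {
    "1": ("Download/Setup Kaggle Dataset", lambda: "kaggle_setup"),
    "2": ("Collect Manual Data (Single Letter)", lambda: "collect_single"),
    "3": ("Collect Manual Data (Full Alphabet)", lambda: "collect_alphabet"),
    "4": ("Train Model", lambda: "train"),
    "5": ("Run Real-Time Recognition", lambda: "recognize"),
    "6": ("Exit", lambda: "exit"),
}

print(dispatch("4", ACTIONS))  # train
```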
```python
from asl_workflow import quick_collect_single_letter, quick_train_kaggle_only, quick_run_recognition

# Collect 100 samples for letter 'A'
quick_collect_single_letter('A', 100)

# Train on Kaggle data only
quick_train_kaggle_only(50)

# Run recognition with trained model
quick_run_recognition('best_asl_model.h5')
```

The system integrates:
- Bi-Directional CNN (Bi-CNN) → for spatial-temporal hand gesture analysis
- BERT Transformer → for contextual English text generation
- Ensemble Learning → combines predictions from both architectures
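One common way to combine predictions from two architectures is soft voting over their class-probability outputs. A minimal sketch, with the caveat that the equal weighting is an assumption; the project's actual combination scheme is not specified here:

```python
import numpy as np

def ensemble_predict(p_bicnn, p_bert, w=0.5):
    """Soft-voting ensemble: blend the class-probability outputs of the
    Bi-CNN and BERT branches, then take the most likely class.
    The weight w is an assumption, not the project's documented value."""
    combined = w * np.asarray(p_bicnn) + (1 - w) * np.asarray(p_bert)
    return combined.argmax(axis=-1)

# Toy probabilities for 2 samples over 3 classes
p_cnn = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
p_bert = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
print(ensemble_predict(p_cnn, p_bert))  # [0 2]
```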
Future work could explore:
- Expanding the system's language coverage by incorporating more regional sign languages
- Developing more sophisticated models for continuous signing and complex sentence structures
- Implementing text-to-sign language generation for two-way communication
- Accuracy: 98.7%
- Precision: 97.6%
- F1-Score: 98.2%
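These metrics follow their standard definitions. For reference, a minimal sketch computing them from binary confusion-matrix counts (the counts below are illustrative; the figures above come from the project's actual evaluation):

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, precision, and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, f1

# Illustrative counts, not the project's evaluation data
acc, prec, f1 = metrics(tp=90, fp=5, fn=5, tn=100)
print(round(acc, 3), round(prec, 3), round(f1, 3))  # 0.95 0.947 0.947
```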
If you use this work for your research, please cite:
```bibtex
@misc{chibueze2025sign,
  title={Development of an Ensemble-Based Bi-Directional CNN and BERT System for Real-Time Sign Language Recognition},
  author={Chibueze, Paul},
  year={2025}
}
```

Special thanks to the deaf community members and sign language experts who contributed to data validation and system refinement.