This project implements a chatbot for FAST University that classifies user intents with a fine-tuned BERT model instead of a traditional bag-of-words approach.
- BERT Fine-tuning: Uses BERT (Bidirectional Encoder Representations from Transformers) for better understanding of context and semantics
- Improved Accuracy: Generally better intent-classification performance than the bag-of-words approach
- Context Awareness: BERT takes word relationships and surrounding context into account
- Fallback Support: Falls back to the original bag-of-words method if BERT fails to load
- Flask Web Interface: Web-based chatbot interface
FAST-bot/
├── Model_training/
│   ├── bert_model_training.py    # BERT training script
│   ├── bert_prediction.py        # BERT prediction script
│   ├── requirements_bert.txt     # BERT training dependencies
│   └── chatbot_intents.json      # Training data
├── Flask_application/
│   ├── app.py                    # Flask web app
│   ├── utils_bert.py             # BERT-based utilities
│   ├── requirements_bert.txt     # Flask BERT dependencies
│   └── templates/
│       └── index.html            # Web interface
└── model/                        # Saved models (created after training)
For training:
cd Model_training
pip install -r requirements_bert.txt
For Flask application:
cd Flask_application
pip install -r requirements_bert.txt
Download the required NLTK data:
import nltk
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('omw-1.4')
To train the model:
cd Model_training
python bert_model_training.py
This will:
- Load your chatbot_intents.json data
- Fine-tune a BERT model for intent classification
- Save the trained model to the model/ directory
- Save tokenizer, label encoder, and label mapping
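The first step above (loading the intents file and turning it into labelled examples) can be sketched as follows. This is a minimal illustration, not the code in bert_model_training.py: the JSON layout (a top-level `intents` list whose entries have `tag`, `patterns`, and `responses`) is an assumption based on the common intents-file convention, and `load_training_pairs` and the sample data are hypothetical.

```python
import json

def load_training_pairs(raw_json):
    """Flatten an intents document into texts, integer labels, and a label mapping."""
    data = json.loads(raw_json)
    texts, tags = [], []
    for intent in data["intents"]:          # assumed top-level key
        for pattern in intent["patterns"]:  # assumed per-intent key
            texts.append(pattern)
            tags.append(intent["tag"])
    # Sorting the unique tags gives a stable tag -> integer id mapping.
    label_to_id = {tag: i for i, tag in enumerate(sorted(set(tags)))}
    labels = [label_to_id[tag] for tag in tags]
    return texts, labels, label_to_id

# Made-up sample data in the assumed layout.
sample = '''{"intents": [
  {"tag": "greeting", "patterns": ["Hi", "Hello"], "responses": ["Hello!"]},
  {"tag": "admission", "patterns": ["How do I apply?"], "responses": ["See the admissions page."]}
]}'''

texts, labels, label_to_id = load_training_pairs(sample)
print(texts)
print(labels)
print(label_to_id)
```

The texts would then go through the BERT tokenizer and the integer labels would be the classification targets; the label mapping is what lets a predicted id be turned back into an intent tag.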
You can modify these parameters in bert_model_training.py:
- Learning Rate: lr=2e-5 (default)
- Batch Size: batch_size=16 (default)
- Epochs: num_epochs=3 (default)
- Max Sequence Length: max_len=128 (default)
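To see how these parameters interact, the sketch below groups them in one place and computes how many optimizer steps a run would take. The `TrainingConfig` class and `total_steps` helper are hypothetical (the training script may organize its settings differently), and the calculation assumes the last partial batch of each epoch is kept.

```python
import math
from dataclasses import dataclass

@dataclass
class TrainingConfig:
    """Hypothetical container holding the defaults listed above."""
    lr: float = 2e-5
    batch_size: int = 16
    num_epochs: int = 3
    max_len: int = 128

    def total_steps(self, num_examples):
        # Batches per epoch (keeping the last partial batch) times epochs.
        return math.ceil(num_examples / self.batch_size) * self.num_epochs

config = TrainingConfig()
# With 1000 training patterns: 63 batches per epoch, 3 epochs.
print(config.total_steps(1000))
```

Halving `batch_size` roughly doubles the number of steps per epoch, which is worth keeping in mind when trading memory for training time.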
After training, you'll have:
- model/bert_chatbot_model.pth - Model weights
- model/label_encoder.pkl - Label encoder
- model/label_mapping.pkl - Label mapping
- model/config.json - BERT configuration
- model/vocab.txt - BERT vocabulary
- model/tokenizer_config.json - Tokenizer configuration
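A quick way to confirm that training produced everything is to check for these files before starting the app. The helper below is a hypothetical sketch (it is not part of the project); the file names are the ones listed above, and the `model` path passed at the end assumes the script runs from the directory that contains model/.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = [
    "bert_chatbot_model.pth",
    "label_encoder.pkl",
    "label_mapping.pkl",
    "config.json",
    "vocab.txt",
    "tokenizer_config.json",
]

def missing_model_files(model_dir):
    """Return the expected files that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]

missing = missing_model_files("model")  # adjust the path to your layout
print("missing files:", missing if missing else "none")
```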
cd Model_training
python bert_prediction.py
This will test the model with sample questions and show:
- Predicted intent
- Confidence score
- Generated response
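The confidence score is typically derived from the classifier's raw outputs (logits). The sketch below shows one common way to do that: apply softmax, take the most probable intent, and reject it if its probability falls below a threshold. It is an assumption that bert_prediction.py works this way; `predict_intent`, the logits, and the label names are made up for illustration, and the 0.25 default mirrors the `error_threshold` setting in utils_bert.py.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_intent(logits, id_to_label, error_threshold=0.25):
    """Return (intent, confidence); intent is None when confidence is too low."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < error_threshold:
        return None, probs[best]
    return id_to_label[best], probs[best]

# Hypothetical logits for three hypothetical intents.
labels = {0: "admission", 1: "fees", 2: "greeting"}
intent, confidence = predict_intent([2.0, 0.5, 0.1], labels)
print(intent, round(confidence, 3))
```

When no intent clears the threshold, the caller can reply with a generic "I did not understand" message instead of a low-confidence guess.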
To use the predictor from your own code:
from bert_prediction import BERTChatbotPredictor
predictor = BERTChatbotPredictor()
response = predictor.get_response("What is the admission process?")
print(response)
To run the Flask application:
cd Flask_application
If you want to use BERT instead of bag-of-words, update app.py:
# Change this line:
from utils import get_response, predict_class
# To this:
from utils_bert import get_response, predict_class
python app.py
Open your browser and go to: http://localhost:5000
The BERT model looks for files in the model/ directory. Make sure the path is correct:
# In utils_bert.py
model_path='../model/' # Relative to Flask_application/
Adjust the confidence threshold for predictions:
# In utils_bert.py
error_threshold=0.25 # Default threshold
The model automatically uses GPU if available, otherwise CPU:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
Advantages of BERT:
- Better Context Understanding: Understands word relationships
- Higher Accuracy: Generally better performance on intent classification
- Semantic Understanding: Better at understanding synonyms and paraphrases
- Transfer Learning: Leverages pre-trained language knowledge
Advantages of bag-of-words:
- Faster Inference: Lighter computational requirements
- Smaller Model Size: Less storage space
- Simplicity: Easier to understand and debug
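The context-understanding point can be made concrete with a small example. A bag-of-words representation only counts words, so two sentences with the same words in a different order look identical to it. The sketch below uses a simplified bag-of-words (lowercase and split on spaces); the project's original bag-of-words pipeline may tokenize and lemmatize differently.

```python
from collections import Counter

def bag_of_words(sentence):
    """Simplified bag-of-words: word counts, with word order discarded."""
    return Counter(sentence.lower().split())

a = bag_of_words("student emails professor")
b = bag_of_words("professor emails student")

# Same counts, so a bag-of-words model cannot tell these sentences apart.
print(a == b)
```

BERT, by contrast, encodes each token together with its position and its neighbours, so the two sentences produce different representations.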
- CUDA Out of Memory
  - Reduce batch size in training
  - Use CPU instead of GPU
  - Reduce max sequence length
- Model Loading Errors
  - Ensure all model files are in the correct directory
  - Check file paths in the code
  - Verify model was trained successfully
- Import Errors
  - Install all required dependencies
  - Check Python version compatibility
  - Ensure transformers and torch versions are compatible
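For the CUDA out-of-memory advice above, it helps to see why batch size and sequence length matter. The rough sketch below counts only the self-attention score matrices, which hold one value per pair of tokens for every head and layer; it ignores weights, other activations, and optimizer state, so treat it as an illustration of scaling rather than a memory estimate. The defaults of 12 heads and 12 layers match bert-base-uncased; the function name is hypothetical.

```python
def attention_score_elements(batch_size, max_len, num_heads=12, num_layers=12):
    """Number of attention scores: one max_len x max_len matrix per head, layer, and example."""
    return batch_size * num_heads * max_len * max_len * num_layers

baseline = attention_score_elements(batch_size=16, max_len=128)
half_batch = attention_score_elements(batch_size=8, max_len=128)
half_len = attention_score_elements(batch_size=16, max_len=64)

print(baseline)
print(half_batch / baseline)  # halving the batch size halves the count
print(half_len / baseline)    # halving the sequence length quarters it
```

This is why shortening `max_len` can free more memory than an equal reduction in `batch_size`, provided your questions are short enough not to be truncated.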
If BERT fails to load, the system automatically falls back to the original bag-of-words method. Check the console for error messages.
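A fallback of this kind can be sketched as "try each backend in order and use the first one that loads". The code below is a generic illustration, not the project's actual mechanism: `load_first_available` is a hypothetical helper, and the demo uses the standard-library `json` module as a stand-in for the fallback so it can run anywhere. In the Flask app the list might be `["utils_bert", "utils"]`.

```python
import importlib

def load_first_available(module_names):
    """Import the first module that loads; return it with the errors seen so far."""
    errors = {}
    for name in module_names:
        try:
            return importlib.import_module(name), errors
        except Exception as exc:  # loading a model can fail with more than ImportError
            errors[name] = repr(exc)
            print(f"could not load {name}: {exc!r}")
    raise ImportError(f"no backend could be loaded: {errors}")

# Demo: the first name does not exist, so the second one is used.
backend, errors = load_first_available(["no_such_bert_backend", "json"])
print("using backend:", backend.__name__)
```

Printing the error before moving on matters: a silent fallback would hide the fact that the app is running without BERT.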
To add a new intent:
- Add the new intent to chatbot_intents.json
- Retrain the BERT model
- Restart the Flask application
Edit the responses in chatbot_intents.json and restart the application.
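Editing the intents file by hand works, but a small helper can guard against duplicate tags. The sketch below assumes the common layout of a top-level `intents` list with `tag`, `patterns`, and `responses` keys, which may differ from the project's file; `add_intent` and the example intent are hypothetical.

```python
import json

def add_intent(data, tag, patterns, responses):
    """Append a new intent to an intents document, refusing duplicate tags."""
    if any(intent["tag"] == tag for intent in data["intents"]):
        raise ValueError(f"intent {tag!r} already exists")
    data["intents"].append(
        {"tag": tag, "patterns": list(patterns), "responses": list(responses)}
    )
    return data

# Made-up document in the assumed layout.
data = {"intents": [{"tag": "greeting", "patterns": ["Hi"], "responses": ["Hello!"]}]}
add_intent(
    data,
    "library_hours",
    ["When is the library open?"],
    ["Please check the library notice board."],
)
print(json.dumps(data, indent=2))
```

A new tag changes the set of labels, so the model has to be retrained; editing only the responses of an existing tag does not.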
You can use different BERT variants by changing the model name:
# In bert_model_training.py
model_name = 'bert-base-uncased' # Default
# Other options:
# model_name = 'bert-large-uncased'
# model_name = 'distilbert-base-uncased'
Requirements:
- Python 3.7+
- PyTorch 1.9+
- Transformers 4.20+
- Flask 2.0+
- CUDA (optional, for GPU acceleration)
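The minimum versions above can be checked programmatically. The sketch below is a hypothetical helper, not part of the project: it compares only the major and minor version numbers, assumes the packages are installed under the distribution names `torch`, `transformers`, and `flask`, and relies on `importlib.metadata`, which needs Python 3.8 or newer.

```python
from importlib import metadata  # available from Python 3.8
from itertools import takewhile

# Minimum (major, minor) versions taken from the list above.
MINIMUMS = {"torch": (1, 9), "transformers": (4, 20), "flask": (2, 0)}

def major_minor(version_text):
    """Parse '1.13.1+cu117' into (1, 13), reading only leading digits of each part."""
    parts = []
    for piece in version_text.split(".")[:2]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)

def check_versions(minimums, get_version=metadata.version):
    """Report 'ok', 'too old', or 'missing' for each required package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if major_minor(installed) >= minimum else "too old"
    return report

print(check_versions(MINIMUMS))
```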
This project is for educational purposes. Please ensure you comply with the license terms of the pre-trained BERT checkpoint you download from Hugging Face (bert-base-uncased is published under Apache 2.0).