IT service ticket email classifier: an experiment to see whether IT support emails can be triaged effectively by training an XLNet-based classifier (XLNetTokenizer plus an inference model) on partially synthetic data. Only minimal real IT support data is used (service locations); everything else (subject lines, email contents, etc.) is taken from open-source datasets or generated with Python and Faker.
- Docker and Docker Compose installed
- Git
# Clone the repository
git clone https://github.com/enzojoly/email-triage-classifier.git
cd email-triage-classifier
# Launch JupyterLab environment
docker compose -f jupyter.yaml up -d
JupyterLab will be available at http://localhost:8888
- Model Training Pipeline: XLNet model training workflow for email classification
- Inference Testing: Testing and validation of the trained classifier on service ticket predictions
- Data Generation Demo: Synthetic data enrichment using the Faker library for training data augmentation
- notebooks/: Jupyter notebooks for training, inference, and data generation
- processed_data/: Enhanced datasets and service keyword mappings
- raw_data/: Original ticket data and email sources
- requirements.txt: Python dependencies
- Dockerfile & jupyter.yaml: Containerized development environment