This repository contains the code and supplementary materials required to train and evaluate a Hugging Face BERT-based model for text classification. The task is to predict the subject label of a given document. The implementation uses PyTorch.
Codebase:
https://github.com/anustupdas/document-sucject-classification.git
Subject datasets:
https://drive.google.com/drive/folders/11-dfGZkqZl-LRgo9cCSSsMyVxFgAEMnW?usp=sharing
Trained Models:
https://drive.google.com/drive/folders/11-dfGZkqZl-LRgo9cCSSsMyVxFgAEMnW?usp=sharing
Raw data exploration and base model training with Keras (notebook):
https://colab.research.google.com/drive/1WgVZg0n2BYJLMSvjguWv_5nz6jMi2F0-?usp=sharing
Final preprocessed training data exploration (notebook):
https://colab.research.google.com/drive/1Uw_vfjolHjcDy4Ijm9epOJnqTea4UZe4?usp=sharing
Final model training, evaluation, and inference notebook [Main Notebook]:
https://colab.research.google.com/drive/1zBUQ_0NfZBOn252we-QEWkSbkOct1nMc?usp=sharing
This project is licensed under the MIT License - see the LICENSE file for details.
- Anustup [email protected]