anustupdas/document-subject-classification

Document Subject Text Classification

This repository contains the code and supplementary materials needed to train and evaluate a Hugging Face BERT-based model for text classification. The task is to predict the subject label of a given document. The implementation uses PyTorch.
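As a rough illustration of the evaluation side of this task, the sketch below maps subject labels to integer ids and scores a set of predictions with accuracy and macro-averaged F1. The label names, the helper functions `build_label_map`, `accuracy`, and `macro_f1`, and the choice of metrics are all assumptions made for illustration; they are not taken from this repository's code or notebooks.

```python
from collections import Counter


def build_label_map(labels):
    """Map each distinct subject label to an integer id (sorted for stability)."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


def accuracy(gold, predicted):
    """Fraction of documents whose predicted subject matches the gold subject."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equally long")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def macro_f1(gold, predicted):
    """Unweighted mean of per-label F1 over every label seen in gold or predicted."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equally long")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    scores = []
    for label in sorted(set(gold) | set(predicted)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical subject labels, used only to exercise the helpers.
gold = ["physics", "biology", "physics", "history"]
pred = ["physics", "physics", "physics", "history"]

label_map = build_label_map(gold)
print(label_map)
print(accuracy(gold, pred))
print(macro_f1(gold, pred))
```

Macro averaging is used here because it weights every subject equally, which can be informative when some subjects have far fewer documents than others; whether that matches the evaluation in the notebooks is not stated in this README.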

Codebase:

https://github.com/anustupdas/document-subject-classification.git

Subject datasets:

https://drive.google.com/drive/folders/11-dfGZkqZl-LRgo9cCSSsMyVxFgAEMnW?usp=sharing

Trained Models:

https://drive.google.com/drive/folders/11-dfGZkqZl-LRgo9cCSSsMyVxFgAEMnW?usp=sharing

Links to Notebooks:

Raw data exploration and base model training with Keras notebook:

https://colab.research.google.com/drive/1WgVZg0n2BYJLMSvjguWv_5nz6jMi2F0-?usp=sharing

Final preprocessed training data exploration notebook:

https://colab.research.google.com/drive/1Uw_vfjolHjcDy4Ijm9epOJnqTea4UZe4?usp=sharing

Final model training, evaluation, and inference notebook [Main Notebook]:

https://colab.research.google.com/drive/1zBUQ_0NfZBOn252we-QEWkSbkOct1nMc?usp=sharing

License

This project is licensed under the MIT License - see the LICENSE file for details.

Authors
