Audio tagging is the process of inferring descriptive labels from audio clips. It can be treated as a multi-label classification task: a single clip may carry several tags at once, so the model must recognize every sound category present rather than pick just one. Studying audio tagging for real-world problems requires a good general-purpose tagging model, and different audio preprocessing methods must be explored alongside it.

This repository contains exploratory code and scripts for audio preprocessing and model fitting on the Freesound dataset provided by Kaggle as part of its FSD Audio Tagging Challenge (2019). Several alternatives are being researched in an effort to build a good pipeline for audio tagging and its applications. Once that is done, the developed model and preprocessing techniques can be used for transfer learning on applications that require real-time audio processing and classification.
- Urban sounds:
- Animal sounds:
- ESC-50
- Open source dataset 1 - Animal-Sound-Dataset-Research-2019-Sri-Lanka
- Open source dataset 2 - Animal-sounds-Embedded-Classifier
- Open source dataset 3 - Animal-Sound-Classification-Using-A-Convolutional-Neural-Network
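
To make the multi-label framing from the introduction concrete, here is a minimal sketch of how a clip's tags could be turned into a multi-hot target vector. It assumes each clip's tags arrive as a single comma-separated string; the helper names (`build_vocabulary`, `multi_hot`) and the sample tags are hypothetical and are not taken from this repository's code.

```python
def build_vocabulary(label_strings):
    """Collect the sorted list of distinct tags seen across all clips."""
    tags = set()
    for labels in label_strings:
        tags.update(t.strip() for t in labels.split(",") if t.strip())
    return sorted(tags)


def multi_hot(label_string, vocabulary):
    """Encode one clip's comma-separated tags as a 0/1 vector over the vocabulary."""
    index = {tag: i for i, tag in enumerate(vocabulary)}
    vector = [0] * len(vocabulary)
    for tag in label_string.split(","):
        tag = tag.strip()
        if tag:
            vector[index[tag]] = 1
    return vector


# Hypothetical tag strings for three clips (illustrative only).
clips = ["Bark", "Bark,Meow", "Traffic_noise"]
vocab = build_vocabulary(clips)
targets = [multi_hot(labels, vocab) for labels in clips]
```

Because a clip can activate more than one position in its target vector, a model trained on such targets would typically use an independent sigmoid output per tag rather than a single softmax over all tags.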