Description
We want to add support for audio datasets in GuideLLM to enable benchmarking of multi-modal audio models like Whisper. Since audio datasets are fairly limited, we should structure the data by use case so that developers can easily understand the context of the data and what the model is being benchmarked on.
User Story
As a developer, I want to benchmark a Whisper model with different audio dataset profiles so that I can understand performance before moving to production and verify that my use case (call-center summarization, translation, etc.) can be met on my target hardware.
Acceptance Criteria
- Enable support for the leading Hugging Face audio datasets: https://huggingface.co/blog/audio-datasets#a-tour-of-audio-datasets-on-the-hub
- Create dataset profiles for the different use cases, organized in structured folders per use case:
  - Multilingual Language Translation
  - English Speech Recognition
  - Speech Translation
  - Audio Classification
  - TBD
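One possible shape for these use-case profiles is sketched below. This is only an illustration of the idea, not a decided design: the `AudioDatasetProfile` class, its fields, and the example dataset IDs are all assumptions for discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioDatasetProfile:
    """Hypothetical profile describing one audio benchmarking use case."""

    use_case: str                    # human-readable use-case name
    hf_dataset: str                  # Hugging Face Hub dataset ID (illustrative)
    split: str = "test"              # dataset split to benchmark against
    language: Optional[str] = None   # language/config, if the dataset needs one


# Illustrative mapping of use cases to well-known Hub datasets; the actual
# profile layout and dataset choices are still to be decided in this issue.
PROFILES = {
    "english_speech_recognition": AudioDatasetProfile(
        use_case="English Speech Recognition",
        hf_dataset="librispeech_asr",
    ),
    "multilingual_translation": AudioDatasetProfile(
        use_case="Multilingual Language Translation",
        hf_dataset="mozilla-foundation/common_voice_11_0",
        language="fr",
    ),
}
```

Keeping profiles as plain data like this would let each use-case folder ship a small config file that GuideLLM resolves to a Hub dataset at benchmark time.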
Metadata
Status: Backlog