Update README.md
goodmike31 authored Mar 10, 2024
1 parent cb8c6ad commit cac9299
Showing 1 changed file with 33 additions and 19 deletions.
52 changes: 33 additions & 19 deletions README.md
@@ -1,17 +1,44 @@
Polish ASR speech datasets survey and catalog.


Important: before exploring the Polish ASR speech data survey and catalog, please consider completing the short feedback [form](https://forms.gle/FL1obRfYMvgHVoEu5).<br>
Your feedback will help assess the state of Polish ASR datasets from the community's perspective and improve the design of the [Polish ASR speech datasets catalog](https://docs.google.com/spreadsheets/d/181EDfwZNtHgHFOMaKNtgKssrYDX4tXTJ9POMzBsCRlI/edit?usp=sharing).<br>
Each response is rewarded with a 50 PLN donation to the chosen charity organization. Thank you!
[Feedback form link](https://forms.gle/FL1obRfYMvgHVoEu5)

# Polish ASR speech data survey goals
- Organizing publicly available information about Polish ASR speech datasets into a catalog.
- Providing the ASR community with a collectively maintained, up-to-date source of information about available datasets.
- Identifying Polish ASR speech datasets that are freely available for research and commercial purposes.
<br>

# Request for feedback
Before exploring the Polish ASR speech data survey and catalog, please consider completing the short (5 min) feedback [form](https://forms.gle/FL1obRfYMvgHVoEu5).<br>
Your feedback will help assess the state of Polish ASR datasets from the community's perspective.
Each response is rewarded with a 50 PLN donation to the chosen charity organization. Thank you!
[Feedback form link](https://forms.gle/FL1obRfYMvgHVoEu5)

# How to cite?
If you use the raw data from the catalog, please cite it as follows:
```bibtex
@Misc{pl-asr-speech-data-survey,
author = {Micha{\l} Junczyk},
title = {Polish ASR speech data catalog},
howpublished = {GitHub},
year = {2023},
url = {https://github.com/goodmike31/pl-asr-speech-data-survey}
}
```
If you use the survey results, please cite the corresponding article:
```bibtex
@article{Junczyk+2024+27+52,
url = {https://doi.org/10.1515/psicl-2023-0019},
title = {A survey of Polish ASR speech datasets},
author = {Micha{\l} Junczyk},
pages = {27--52},
volume = {60},
number = {1},
journal = {Poznan Studies in Contemporary Linguistics},
doi = {10.1515/psicl-2023-0019},
year = {2024},
lastchecked = {2024-03-10}
}
```

# Survey results:
- **Fifty-three datasets (53)** intended for Polish ASR development were identified based on the information available in the public domain.
- **Forty-four (44)** out of 53 identified datasets are accessible.
@@ -42,19 +69,6 @@ Each response is awarded by donation of 50 PLN to the chosen charity organizatio
1. Open the [catalog](https://docs.google.com/spreadsheets/d/181EDfwZNtHgHFOMaKNtgKssrYDX4tXTJ9POMzBsCRlI/edit?usp=sharing)
2. Set filter on **Usage cost** column to **free**
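
The two steps above can also be approximated offline. The following is a minimal sketch, assuming the catalog has been exported to CSV and keeps a column named `Usage cost` whose cells contain `free`; the helper `free_datasets`, the `Dataset name` column, and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io


def free_datasets(csv_text, cost_column="Usage cost", free_value="free"):
    """Return the catalog rows whose usage-cost cell equals `free_value`.

    The comparison ignores case and surrounding whitespace.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if (row.get(cost_column) or "").strip().lower() == free_value
    ]


# Hypothetical rows standing in for a CSV export of the catalog.
sample = """Dataset name,Usage cost
Example corpus A,free
Example corpus B,paid
Example corpus C,Free
"""

for row in free_datasets(sample):
    print(row["Dataset name"])
```

Matching case-insensitively is a design choice of this sketch; it guards against inconsistent capitalization in a manually maintained sheet.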

# How to cite this resource?
@article{Junczyk+2024+27+52, <br>
url = {https://doi.org/10.1515/psicl-2023-0019}, <br>
title = {A survey of Polish ASR speech datasets}, <br>
author = {Michał Junczyk},<br>
pages = {27--52},<br>
volume = {60},<br>
number = {1},<br>
journal = {Poznan Studies in Contemporary Linguistics},<br>
doi = {doi:10.1515/psicl-2023-0019},<br>
year = {2024},<br>
lastchecked = {2024-03-10}<br>
}
<br>
# Addendum - Survey design
