Elpis (Accelerated Transcription for Linguists)

What is Elpis?

Elpis is a tool that allows language workers with minimal computational experience to build their own speech models for automatically transcribing audio. It relies on the Kaldi automatic speech recognition (ASR) toolkit. Kaldi is notorious for being difficult to build, use and navigate, even for trained computer scientists. The goal of Elpis is to expose the power of Kaldi to linguists and language workers by abstracting away much of the needless technical complexity.

Currently, the major component of Elpis is kaldi_helpers, a collection of Python and shell scripts designed to prepare data for use with Kaldi and convert between the various time-aligned transcription formats that linguists work with.
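To give a feel for the kind of data preparation involved, here is a minimal sketch of turning time-aligned annotations into the line formats Kaldi expects in its `segments` and `text` files. This is not the kaldi_helpers code itself: the function name `to_kaldi_lines`, the annotation field names (`audio_file_name`, `start_ms`, `stop_ms`, `transcript`) and the utterance ID scheme are assumptions made for illustration.

```python
import os
from typing import Dict, List, Tuple


def to_kaldi_lines(annotations: List[Dict], speaker: str = "spk1") -> Tuple[List[str], List[str]]:
    """Build Kaldi 'segments' and 'text' lines from time-aligned annotations.

    Each annotation is assumed (hypothetically) to be a dict with the keys
    'audio_file_name', 'start_ms', 'stop_ms' and 'transcript'.
    """
    segments: List[str] = []
    text: List[str] = []
    for index, ann in enumerate(annotations):
        # Collapse runs of whitespace; skip annotations with no transcript.
        transcript = " ".join(str(ann["transcript"]).split())
        if not transcript:
            continue
        # Recording ID: the audio file name without directory or extension.
        rec_id = os.path.splitext(os.path.basename(ann["audio_file_name"]))[0]
        # Utterance ID: speaker prefix first, so utterances sort by speaker.
        utt_id = f"{speaker}-{rec_id}-{index:04d}"
        start_s = ann["start_ms"] / 1000.0
        stop_s = ann["stop_ms"] / 1000.0
        # segments: <utterance-id> <recording-id> <start-seconds> <end-seconds>
        segments.append(f"{utt_id} {rec_id} {start_s:.3f} {stop_s:.3f}")
        # text: <utterance-id> <transcript>
        text.append(f"{utt_id} {transcript}")
    return segments, text


if __name__ == "__main__":
    example = [
        {"audio_file_name": "story.wav", "start_ms": 0, "stop_ms": 1500, "transcript": "hello world"},
    ]
    seg_lines, text_lines = to_kaldi_lines(example)
    print("\n".join(seg_lines))
    print("\n".join(text_lines))
```

Prefixing the utterance ID with the speaker ID follows the usual Kaldi convention, which keeps utterance and speaker sort orders consistent.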

How Does It Work?

Elpis uses Docker, specifically an Ubuntu Linux image, to install Kaldi and its (many) dependencies. The image also installs kaldi_helpers and the Task task runner. We have defined a number of tasks that automate common workflows such as data preparation, model creation and inferring transcriptions for new files. You can read about these tasks and how to use them in the project documentation.

How Do I Use It?

Please check the wiki pages for a step-by-step guide to using Elpis on your data. If you're comfortable using Docker and are aware of the types of data required, you can also consult our more concise advanced guide.

Requirements

The following programs must be installed and available in your environment:

python3 + pip
git
npm
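If you want to confirm these programs are available before going further, a small check along the following lines may help. It is only a sketch: the helper name `missing_programs` is made up for this example, and the executable names are assumptions (on some systems pip is installed as `pip3`).

```python
import shutil

# Executable names are assumptions; adjust them (e.g. "pip3") for your system.
REQUIRED = ("python3", "pip", "git", "npm")


def missing_programs(programs=REQUIRED, which=shutil.which):
    """Return the programs from `programs` that cannot be found on PATH."""
    return [name for name in programs if which(name) is None]


if __name__ == "__main__":
    missing = missing_programs()
    if missing:
        print("Missing required programs: " + ", ".join(missing))
    else:
        print("All required programs were found.")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the default `shutil.which` searches your PATH.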

Why Is It Called Elpis?

Elpis is the Greek goddess of hope. For us, Elpis represents our hope that one day everyone will have access to the power of ASR without having to interact directly with the Kaldi codebase. We've also backronymed it to stand for "Endangered Language Pipeline and Inference System," as much of our motivation for this project derives from our desire to assist documentation and revitalisation efforts for the world's many endangered languages, including many in Australia, where most of Elpis' development has occurred.

I'm An Academic, How Do I Cite This?

This software is the product of academic research funded by the Australian Research Council Centre of Excellence for the Dynamics of Language. If you use the software or code in an academic setting, please be sure to cite it appropriately as follows:

Foley, B., Arnold, J., Coto-Solano, R., Durantin, G., Ellison, T. M., van Esch, D., Heath, S., Kratochvíl, F., Maxwell-Smith, Z., Nash, D., Olsson, O., Richards, M., San, N., Stoakes, H., Thieberger, N. & Wiles, J. (2018). Building Speech Recognition Systems for Language Documentation: The CoEDL Endangered Language Pipeline and Inference System (Elpis). In S. S. Agrawal (Ed.), The 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU) (pp. 200–204). Available at https://www.isca-speech.org/archive/SLTU_2018/pdfs/Ben.pdf.
