In the Kaldi-format transcription files, numbers that are not present in the audio are sometimes inserted. For example:
medium/2149/mark_wnt_mp_0808_librivox_64kb_mp3/mark_2_weymouth_64kb_39
008 028 JOHN THE BAPTIST THEY REPLIED BUT OTHERS SAY ELIJAH AND OTHERS THAT IT IS
ONE OF THE PROPHETS 008 029 THEN HE ASKED THEM POINTEDLY BUT YOU YOURSELVES
WHO DO YOU SAY THAT I AM
Above, 008 028 and 008 029 seem to be Bible verse numbers that are in the original text but not actually read. I have verified this by listening to the audio sample.
Hmm, that's possible. I will check whether there is a bug in the alignment tools.
They are all over the place. I think these are page numbers or something similar; sometimes they are even Roman numerals, sometimes they are in brackets, and sometimes they appear as plain numbers.
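As a rough way to locate these cases, here is a minimal sketch that flags tokens matching the three shapes mentioned above (plain digits, bracketed tokens, Roman numerals). The function name `suspicious_tokens` and the patterns are my own assumptions, not part of the dataset tooling, and the Roman-numeral check can produce false positives on real words such as `MIX`, so the output should be treated as candidates for manual review only.

```python
import re

# Hypothetical patterns for illustration; they are not taken from the dataset pipeline.
DIGITS = re.compile(r"\d+")                      # e.g. 008, 029
BRACKETED = re.compile(r"[\[\(\{].*[\]\)\}]")    # e.g. [12], (3)
ROMAN = re.compile(
    r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})"
)

def suspicious_tokens(transcript):
    """Return (index, token) pairs that look like unread numbering."""
    flagged = []
    for index, token in enumerate(transcript.split()):
        # Require at least two letters so the pronoun "I" is not flagged.
        is_roman = len(token) >= 2 and ROMAN.fullmatch(token) is not None
        if DIGITS.fullmatch(token) or BRACKETED.fullmatch(token) or is_roman:
            flagged.append((index, token))
    return flagged

if __name__ == "__main__":
    line = "008 028 JOHN THE BAPTIST THEY REPLIED BUT OTHERS SAY ELIJAH"
    print(suspicious_tokens(line))
```

A flagged token still has to be checked against the audio, since a reader may genuinely speak a number.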
Hmm, we filter the segments according to the Levenshtein distance between the original text and the transcript text. When a segment is long, insertions like these may barely affect the overall distance, so the distance is still below the given threshold. So far, we have not figured out how to fix this bug.
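To illustrate why a distance threshold can miss this, here is a small sketch using the example segment from this issue. It assumes a word-level Levenshtein distance normalized by the longer token count and a hypothetical threshold of 0.2; the actual normalization and threshold used in the pipeline are not stated in this thread.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]

def normalized_distance(original, spoken):
    """Distance divided by the longer token count (an assumed normalization)."""
    a, b = original.split(), spoken.split()
    return edit_distance(a, b) / max(len(a), len(b), 1)

THRESHOLD = 0.2  # hypothetical value, for illustration only

original = ("008 028 JOHN THE BAPTIST THEY REPLIED BUT OTHERS SAY ELIJAH AND "
            "OTHERS THAT IT IS ONE OF THE PROPHETS 008 029 THEN HE ASKED THEM "
            "POINTEDLY BUT YOU YOURSELVES WHO DO YOU SAY THAT I AM")
# What is actually read aloud: the same text without the verse numbers.
spoken = " ".join(t for t in original.split() if not t.isdigit())

if __name__ == "__main__":
    score = normalized_distance(original, spoken)
    print(f"normalized distance = {score:.3f}, kept = {score < THRESHOLD}")
```

Under these assumptions the four extra tokens cost only 4 edits out of 37 words, so the segment passes the filter even though the verse numbers are never spoken.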