Python streaming (microphone) recognition example #17

Open
anatol-grabowski opened this issue Jul 1, 2023 · 4 comments

@anatol-grabowski

Hey, nice job on both this project and Live Captions!

It's not clear, though, how to go about streaming (microphone) recognition in Python, or whether it's even possible. An example would be helpful.

@anatol-grabowski (Author)

Here is what I've tried:

from typing import List
import sys
import april_asr as april
import queue
import sounddevice as sd

def asr_cb(result_type: april.Result, tokens: List[april.Token]):
    """Simple handler that concatenates all tokens and prints it"""
    prefix = "."
    if result_type == april.Result.FINAL_RECOGNITION:
        prefix = "@"
    elif result_type == april.Result.PARTIAL_RECOGNITION:
        prefix = "-"

    string = ""
    for token in tokens:
        string += token.token

    print(f"{prefix}{string}")

model = april.Model(sys.argv[1])
print("Name: " + model.get_name())
print("Description: " + model.get_description())
print("Language: " + model.get_language())
session = april.Session(model, asr_cb, asynchronous=True)

def audio_cb(indata, frames, time, status):
    session.feed_pcm16(bytes(indata))

def run(device: int) -> None:
    with sd.RawInputStream(samplerate=16000, blocksize = 8000, device=device, dtype='int16', channels=1, callback=audio_cb):
        while True:
            pass

    session.flush()

def main():
    args = sys.argv
    if len(args) != 3:
        print("Usage: " + args[0] + " /path/to/model.april 5 # 5 - sound device number")
    else:
        run(int(args[2]))

if __name__ == "__main__":
    main()

But my understanding of what I'm doing is quite basic. Unsurprisingly, no luck so far:

> pipenv run python main.py /home/anatoly/Downloads/april-english-dev-01110_en.april 5
Name: April English Dev-01110
Description: Punctuation + Numbers 23a3
Language: en
libapril: (/home/runner/work/april-asr/april-asr/src/proc_thread.c:54) [WARNING] Failed to initialize cnd_t
libapril: (/home/runner/work/april-asr/april-asr/src/proc_thread.c:76) [ERROR] Failed to lock mutex in pt_raise!
libapril: (/home/runner/work/april-asr/april-asr/src/proc_thread.c:82) [ERROR] Failed to unlock mutex in pt_raise!

@abb128 (Owner) commented Jul 1, 2023

Can you try the latest python build from https://github.com/abb128/april-asr/actions/runs/5105158229 and let me know if it still happens?

@anatol-grabowski (Author)

I have found out that the asynchronous flag was not necessary for my purposes.
Checked: it works with the asynchronous=True flag using "april_asr-0.0.3-py3-none-manylinux_2_31_x86_64.whl" from the link above.
Uninstalled april-asr and reinstalled from PyPI - it doesn't work with the asynchronous=True flag (errors above).

So the issue doesn't happen with the latest build from the link above.
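
For reference, here is a minimal sketch of the two session setups I compared (same asr_cb style as above; the model path is a placeholder and the comments reflect my understanding rather than the library docs):

from typing import List
import april_asr as april

def asr_cb(result_type: april.Result, tokens: List[april.Token]):
    print("".join(t.token for t in tokens))

model = april.Model('/path/to/model.april')  # placeholder path

# Synchronous session: as far as I understand, recognition runs inside
# feed_pcm16() on the calling thread. This is the variant I ended up using,
# and it works with the PyPI release.
session_sync = april.Session(model, asr_cb)

# Asynchronous session: recognition runs on a background thread. With the
# PyPI wheel this hit the cnd_t/mutex errors above; with the CI wheel it works.
session_async = april.Session(model, asr_cb, asynchronous=True)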

@anatol-grabowski (Author)

Here is the updated example code that works with the microphone:

from typing import List
import sys
import april_asr as april
import sounddevice as sd
import numpy as np
import time

def asr_cb(result_type: april.Result, tokens: List[april.Token]):
    prefix = "."
    if result_type == april.Result.FINAL_RECOGNITION:
        prefix = "@"
    elif result_type == april.Result.PARTIAL_RECOGNITION:
        prefix = "-"

    string = ""
    for token in tokens:
        string += token.token

    print(f"{prefix}{string}")

model = april.Model('/home/anatoly/Downloads/april-english-dev-01110_en.april')
print("Name: " + model.get_name())
print("Description: " + model.get_description())
print("Language: " + model.get_language())
session = april.Session(model, asr_cb)


duration = 10  # seconds
samplerate = 16000 # samples/second
channels = 1
shape = (int(samplerate * duration), channels)
dtype = np.int16

chunks = []  # keep the raw audio so it can be played back at the end

def audio_callback(indata, frames, times, status):
    chunks.append(indata.copy())
    session.feed_pcm16(bytes(indata))

stream = sd.InputStream(samplerate=samplerate, channels=channels, dtype=dtype, callback=audio_callback)
stream.start()
sd.sleep(duration * 1000)
stream.stop()
stream.close()
session.flush()

# play the recorded audio to make sure that it is being recorded correctly
buff = np.concatenate(chunks)
sd.play(buff, samplerate=samplerate)
sd.wait()
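
The example above stops after a fixed 10-second window. For open-ended streaming from the microphone, something along the lines of my first attempt should work once the argument handling is fixed; this is just a sketch (model path and device index are passed on the command line, and I stop it with Ctrl+C):

from typing import List
import sys
import april_asr as april
import sounddevice as sd

def asr_cb(result_type: april.Result, tokens: List[april.Token]):
    prefix = {april.Result.FINAL_RECOGNITION: "@",
              april.Result.PARTIAL_RECOGNITION: "-"}.get(result_type, ".")
    print(prefix + "".join(t.token for t in tokens))

def run(model_path: str, device: int) -> None:
    model = april.Model(model_path)
    session = april.Session(model, asr_cb)

    def audio_cb(indata, frames, time, status):
        # RawInputStream delivers raw int16 bytes, which is what feed_pcm16 expects
        session.feed_pcm16(bytes(indata))

    # Stream from the microphone until interrupted with Ctrl+C
    with sd.RawInputStream(samplerate=16000, blocksize=8000, device=device,
                           dtype='int16', channels=1, callback=audio_cb):
        try:
            while True:
                sd.sleep(1000)
        except KeyboardInterrupt:
            pass
    session.flush()

if __name__ == "__main__":
    run(sys.argv[1], int(sys.argv[2]))  # /path/to/model.april <device index>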
