reintroduce opus on VAD, change frame size according to firmware v1.0, change realtime resolution for transcribe #624

0xzre · 2024-08-19T11:34:32Z

#518
The encoder in the Friend firmware v1.0 code uses a frame size of 160 samples (10 ms). I have not tested this on a Friend because I don't have the device.
Changing the real-time resolution to the standard 20 ms should, in theory, reduce server load.
Thank you!
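
To make the frame arithmetic above concrete, here is a minimal sketch. It assumes 16 kHz, 16-bit mono PCM (the 16 kHz rate is inferred from 160 samples being 10 ms; the sample width is an assumption), and the helper names are hypothetical, not taken from the backend.

```python
SAMPLE_RATE_HZ = 16_000  # assumption: implied by 160 samples == 10 ms
SAMPLE_WIDTH_BYTES = 2   # assumption: 16-bit mono PCM


def frame_duration_ms(frame_samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Duration in milliseconds of a frame with the given number of samples."""
    return frame_samples * 1000 / sample_rate_hz


def samples_per_chunk(chunk_ms: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of samples covered by one chunk of the given length."""
    return sample_rate_hz * chunk_ms // 1000


def bytes_per_chunk(chunk_ms: int) -> int:
    """Size in bytes of one decoded PCM chunk of the given length."""
    return samples_per_chunk(chunk_ms) * SAMPLE_WIDTH_BYTES


def chunks_per_second(chunk_ms: int) -> float:
    """How many chunks the server handles per second of audio."""
    return 1000 / chunk_ms
```

Under these assumptions, a 20 ms resolution means 320 samples (640 bytes) per chunk and half as many chunks per second as a 10 ms resolution.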

mdmohsin7

It still doesn't work, there's no transcript.
Also there's this warning and I am not sure if it is something to be worried about?

backend/routers/transcribe.py:102: UserWarning: The given buffer is not writable, and PyTorch does not support non-writable tensors. This means you can write to the underlying (supposedly non-writable) buffer using the tensor. You may want to copy the buffer to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/utils/tensor_new.cpp:1530.)
  samples = torch.frombuffer(decoded_opus, dtype=torch.int16).float() / 32768.0

0xzre · 2024-08-20T01:21:03Z

> It still doesn't work, there's no transcript. Also there's this warning and I am not sure if it is something to be worried about?
>
> backend/routers/transcribe.py:102: UserWarning: The given buffer is not writable, and PyTorch does not support non-writable tensors. This means you can write to the underlying (supposedly non-writable) buffer using the tensor. You may want to copy the buffer to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/utils/tensor_new.cpp:1530.)
>   samples = torch.frombuffer(decoded_opus, dtype=torch.int16).float() / 32768.0

That warning is related to how I handle the decoded Opus buffer, and I'll fix it soon.
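
For reference, one common way to avoid that warning is to hand PyTorch a writable copy of the bytes, for example `torch.frombuffer(bytearray(decoded_opus), dtype=torch.int16)`. The sketch below shows the same idea using only the standard library, so it runs without PyTorch. The function names are hypothetical, and little-endian 16-bit PCM is an assumption.

```python
import struct


def writable_copy(decoded: bytes) -> bytearray:
    """Return a mutable copy; `bytes` objects expose a read-only buffer."""
    return bytearray(decoded)


def pcm16_to_float(decoded: bytes) -> list:
    """Convert little-endian int16 PCM bytes to floats in [-1.0, 1.0)."""
    count = len(decoded) // 2
    samples = struct.unpack(f"<{count}h", decoded[: count * 2])
    return [s / 32768.0 for s in samples]


# With PyTorch installed, the equivalent of the flagged line would be (not run here):
#   samples = torch.frombuffer(writable_copy(decoded_opus), dtype=torch.int16).float() / 32768.0
```

The copy costs one extra allocation per chunk, which is small for 10 to 20 ms frames.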

mdmohsin7

It does transcribe, but the problem is that it misses a lot of segments (way more than PCM with VAD).
The websocket disconnects more frequently.
Also, the transcription is quite slow for both PCM and Opus.

0xzre · 2024-08-21T20:26:01Z

Sounds like the server load got heavier.
More missed segments and slower transcription -> incoming bytes take a long time to process in VAD, which increases the delay to DG.
More websocket disconnects -> ping/pong doesn't get through or isn't processed in time, because of high CPU usage in VAD.

My solution:

Use ONNX Runtime for VAD
Reduce the VAD window; it is now 4x smaller

Any feedback or opinion is appreciated. Thanks!
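
As an illustration of the disconnect diagnosis above: if CPU-heavy VAD runs directly on the asyncio event loop, keepalive handling has to wait behind it. Below is a minimal sketch of moving that work to a worker thread. This is not what the PR does (it switches to ONNX Runtime and a smaller window); `fake_vad` is a hypothetical stand-in for the real model call, and a worker thread only helps when the heavy work releases the GIL (here `time.sleep` does).

```python
import asyncio
import time


def fake_vad(chunk: bytes) -> bool:
    """Hypothetical stand-in for a blocking VAD call."""
    time.sleep(0.05)  # simulate blocking inference
    return any(chunk)  # "speech" if any byte is non-zero


async def heartbeat(ticks: list, stop: asyncio.Event) -> None:
    """Stands in for ping/pong handling that must keep running."""
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def process(chunks: list) -> tuple:
    """Run VAD on each chunk off the event loop; return results and tick count."""
    ticks: list = []
    stop = asyncio.Event()
    hb = asyncio.create_task(heartbeat(ticks, stop))
    # asyncio.to_thread keeps the event loop free while the VAD call blocks
    results = [await asyncio.to_thread(fake_vad, c) for c in chunks]
    stop.set()
    await hb
    return results, len(ticks)
```

Calling `asyncio.run(process(chunks))` returns the per-chunk VAD decisions together with the number of heartbeat ticks that ran while VAD was busy; with VAD called inline instead, the heartbeat would not tick during those calls.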

0xzre · 2024-08-22T19:00:48Z

Changes

More handling of socket2 data, which is always used when Opus (Friend mic, not phone) is involved. It aims to fix the socket disconnect error while keeping PCM working.
Any feedback is welcome, thank you :)

0xzre · 2024-08-25T11:54:01Z

@josancamon19 @mdmohsin7 Already merged with the main branch, which gives better results when a speech profile is used. Please review, thanks!

josancamon19 · 2024-08-28T22:51:16Z

https://share.icloud.com/photos/06dFrjm9Q_RrsvZO5VLScWGLg

Clearly doesn't work. For the next review, please submit videos of it working through the app.

0xzre · 2024-08-31T06:31:47Z

@josancamon19 @mdmohsin7 Drive link: https://drive.google.com/drive/folders/1h1nbyLAaVt72Wwy-yO_5C8L5_re17ptI?usp=sharing Please review thanks!

0xzre · 2024-09-01T18:46:05Z

I have added more tests, now using a lecture video (a more conversation-like situation), in the "test 1" folder. I also provided the PCM transcript from the Play Store app (no VAD) as the ground truth. The result: latency is indistinguishable and accuracy is much improved. VAD with Opus is usable.

0xzre · 2024-09-13T03:29:26Z

dude @josancamon19

josancamon19 · 2024-09-26T18:17:28Z

Moving PR to #922

reintroduce opus, change frame size according to firmware v1.0

1d2fc85

mdmohsin7 self-requested a review August 19, 2024 15:58

mdmohsin7 requested changes Aug 19, 2024

change realtime resolution to 20ms, fix error log on Opus

ab33cf3

0xzre changed the title ~~reintroduce opus on VAD, change frame size according to firmware v1.0~~ reintroduce opus on VAD, change frame size according to firmware v1.0, change realtime resolution for transcribe Aug 20, 2024

0xzre added 2 commits August 20, 2024 12:42

remove minimum buffer size to send DG

32a2ee8

fix byte

7db74fd

mdmohsin7 self-requested a review August 20, 2024 17:28

mdmohsin7 requested changes Aug 20, 2024

reduce window size to be VAD-ed, use VAD onnx format & runtime, fix w…

21b1633

…s error message

fix socket2 handing incoming byte in speech profile duration

9f0a85a

0xzre added 3 commits August 25, 2024 18:47

Merge branch 'main' of https://github.com/0xzre/Friend into ws-vad-fix

79fa836

Merge branch 'main' of https://github.com/0xzre/Friend into ws-vad-fix

15235aa

Merge branch 'main' of https://github.com/0xzre/Friend into ws-vad-fix

4f42f18

mostly performance related optimization

e8c1b29

more frequent VAD on Opus is possible without lag compensation

83b11b7

josancamon19 closed this Sep 26, 2024
