Open
Labels: bug (Something isn't working)
Description
Bug Description
I noticed that only the small Whisper model works (cf. #261).
When I switch to a larger model such as medium, something breaks irreversibly:
(In subsequent runs the language-detection p values are << 1.)
whisper_full_with_state: auto-detected language: en (p = 0.998792)
[2026-02-20][22:59:23][handy_app_lib::managers::transcription][INFO] Transcription completed in 48994ms
[2026-02-20][22:59:23][handy_app_lib::managers::transcription][INFO] Transcription result: prob prob prob. at N! N!.
It seems to work up to the language detection (which is still correct here) and breaks afterwards.
From then on, the models produce only gibberish, even when I switch back to Whisper small.
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:45:01][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 126.292029ms
[2026-02-20][22:45:01][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: en (p = 0.577720)
[2026-02-20][22:45:15][handy_app_lib::managers::transcription][INFO] Transcription completed in 11258ms
[2026-02-20][22:45:15][handy_app_lib::managers::transcription][INFO] Transcription result: blah, blah, blah, blah, test.
[2026-02-20][22:45:15][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:45:15][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:45:15][handy_app_lib::clipboard][INFO] Using ydotool for key combo
[2026-02-20][22:45:31][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from encoder-model.int8.onnx...
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=audio_signal, type=Tensor { ty: Float32, shape:[-1, 128, -1], dimension_symbols: SymbolicDimensions(["audio_signal_dynamic_axes_1", "", "audio_signal_dynamic_axes_2"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=length, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from decoder_joint-model.int8.onnx...
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=encoder_outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["encoder_outputs_dynamic_axes_1", "", "encoder_outputs_dynamic_axes_2"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=targets, type=Tensor { ty: Int32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["targets_dynamic_axes_1", "targets_dynamic_axes_2"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=target_length, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_1_dynamic_axes_1", ""]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_2_dynamic_axes_1", ""]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Loading model from nemo128.onnx...
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms, type=Tensor { ty: Float32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["batch_size", "N"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Loaded vocabulary with 8193 tokens, blank_idx=8192
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/ggml-small.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_init_with_params_no_state: backends = 2
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: Vulkan0 total size = 487.01 MB
whisper_model_load: model size = 487.01 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size = 18.87 MB
whisper_init_state: kv cross size = 56.62 MB
whisper_init_state: kv pad size = 4.72 MB
whisper_init_state: compute buffer (conv) = 23.37 MB
whisper_init_state: compute buffer (encode) = 128.01 MB
whisper_init_state: compute buffer (cross) = 6.18 MB
whisper_init_state: compute buffer (decode) = 98.19 MB
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:45:41][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 132.348507ms
[2026-02-20][22:45:41][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: en (p = 0.962492)
[2026-02-20][22:45:53][handy_app_lib::managers::transcription][INFO] Transcription completed in 11451ms
[2026-02-20][22:45:53][handy_app_lib::managers::transcription][INFO] Transcription result: This is a test.
[2026-02-20][22:45:53][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:45:53][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:45:53][handy_app_lib::clipboard][INFO] Using ydotool for key combo
[2026-02-20][22:46:09][handy_app_lib::managers::model][INFO] Starting fresh download of model medium from [URL]
[2026-02-20][22:46:47][handy_app_lib::managers::model][INFO] Successfully downloaded model medium to "/home/john/.local/share/com.pais.handy/models/whisper-medium-q4_1.bin"
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/whisper-medium-q4_1.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_init_with_params_no_state: backends = 2
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1024
whisper_model_load: n_text_head = 16
whisper_model_load: n_text_layer = 24
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 3
whisper_model_load: qntvr = 2
whisper_model_load: type = 4 (medium)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: Vulkan0 total size = 491.23 MB
whisper_model_load: model size = 491.23 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size = 50.33 MB
whisper_init_state: kv cross size = 150.99 MB
whisper_init_state: kv pad size = 6.29 MB
whisper_init_state: compute buffer (conv) = 29.51 MB
whisper_init_state: compute buffer (encode) = 170.15 MB
whisper_init_state: compute buffer (cross) = 7.72 MB
whisper_init_state: compute buffer (decode) = 99.11 MB
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:46:53][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 128.179251ms
[2026-02-20][22:46:53][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:46:58][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 130.732617ms
[2026-02-20][22:46:58][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: nl (p = 0.010000)
[2026-02-20][22:47:48][handy_app_lib::managers::transcription][INFO] Transcription completed in 48429ms
[2026-02-20][22:47:48][handy_app_lib::managers::transcription][INFO] Transcription result: prob prob prob. at!!
[2026-02-20][22:47:48][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:47:48][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:47:48][handy_app_lib::clipboard][INFO] Using ydotool for key combo
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:47:53][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 126.849091ms
[2026-02-20][22:47:53][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: en (p = 0.452633)
[2026-02-20][22:48:44][handy_app_lib::managers::transcription][INFO] Transcription completed in 49879ms
[2026-02-20][22:48:44][handy_app_lib::managers::transcription][INFO] Transcription result: prob!.
[2026-02-20][22:48:44][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:48:44][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:48:44][handy_app_lib::clipboard][INFO] Using ydotool for key combo
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/ggml-small.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_init_with_params_no_state: backends = 2
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: Vulkan0 total size = 487.01 MB
whisper_model_load: model size = 487.01 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size = 18.87 MB
whisper_init_state: kv cross size = 56.62 MB
whisper_init_state: kv pad size = 4.72 MB
whisper_init_state: compute buffer (conv) = 23.37 MB
whisper_init_state: compute buffer (encode) = 128.01 MB
whisper_init_state: compute buffer (cross) = 6.18 MB
whisper_init_state: compute buffer (decode) = 98.19 MB
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:48:57][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 129.716723ms
[2026-02-20][22:48:57][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: nl (p = 0.010000)
[2026-02-20][22:49:10][handy_app_lib::managers::transcription][INFO] Transcription completed in 11416ms
[2026-02-20][22:49:10][handy_app_lib::managers::transcription][INFO] Transcription result: Number
[2026-02-20][22:49:10][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:49:10][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:49:10][handy_app_lib::clipboard][INFO] Using ydotool for key combo
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:49:16][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 128.857267ms
[2026-02-20][22:49:16][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: nl (p = 0.010000)
[2026-02-20][22:49:29][handy_app_lib::managers::transcription][INFO] Transcription completed in 11311ms
[2026-02-20][22:49:29][handy_app_lib::managers::transcription][INFO] Transcription result: jam
[2026-02-20][22:49:29][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:49:29][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:49:29][handy_app_lib::clipboard][INFO] Using ydotool for key combo
[2026-02-20][22:49:35][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from encoder-model.int8.onnx...
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=audio_signal, type=Tensor { ty: Float32, shape:[-1, 128, -1], dimension_symbols: SymbolicDimensions(["audio_signal_dynamic_axes_1", "", "audio_signal_dynamic_axes_2"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=length, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from decoder_joint-model.int8.onnx...
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=encoder_outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["encoder_outputs_dynamic_axes_1", "", "encoder_outputs_dynamic_axes_2"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=targets, type=Tensor { ty: Int32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["targets_dynamic_axes_1", "targets_dynamic_axes_2"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=target_length, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_1_dynamic_axes_1", ""]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_2_dynamic_axes_1", ""]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Loading model from nemo128.onnx...
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms, type=Tensor { ty: Float32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["batch_size", "N"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Loaded vocabulary with 8193 tokens, blank_idx=8192
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/ggml-small.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_init_with_params_no_state: backends = 2
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: Vulkan0 total size = 487.01 MB
whisper_model_load: model size = 487.01 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size = 18.87 MB
whisper_init_state: kv cross size = 56.62 MB
whisper_init_state: kv pad size = 4.72 MB
whisper_init_state: compute buffer (conv) = 23.37 MB
whisper_init_state: compute buffer (encode) = 128.01 MB
whisper_init_state: compute buffer (cross) = 6.18 MB
whisper_init_state: compute buffer (decode) = 98.19 MB
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:49:47][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 129.218878ms
[2026-02-20][22:49:47][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: nn (p = 0.624495)
[2026-02-20][22:50:51][handy_app_lib::managers::transcription][INFO] Transcription completed in 62476ms
[2026-02-20][22:50:51][handy_app_lib::managers::transcription][INFO] Transcription result: Mi
[2026-02-20][22:50:51][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:50:51][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:50:51][handy_app_lib::clipboard][INFO] Using ydotool for key combo
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:50:59][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 127.533426ms
[2026-02-20][22:50:59][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: en (p = 0.414875)
[2026-02-20][22:52:12][handy_app_lib::managers::transcription][INFO] Transcription completed in 70039ms
[2026-02-20][22:52:12][handy_app_lib::managers::transcription][INFO] Transcription result: !!!!
[2026-02-20][22:52:12][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:52:12][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:52:12][handy_app_lib::clipboard][INFO] Using ydotool for key combo
[2026-02-20][22:52:18][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from encoder-model.int8.onnx...
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=audio_signal, type=Tensor { ty: Float32, shape:[-1, 128, -1], dimension_symbols: SymbolicDimensions(["audio_signal_dynamic_axes_1", "", "audio_signal_dynamic_axes_2"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=length, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from decoder_joint-model.int8.onnx...
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=encoder_outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["encoder_outputs_dynamic_axes_1", "", "encoder_outputs_dynamic_axes_2"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=targets, type=Tensor { ty: Int32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["targets_dynamic_axes_1", "targets_dynamic_axes_2"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=target_length, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_1_dynamic_axes_1", ""]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_2_dynamic_axes_1", ""]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Loading model from nemo128.onnx...
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms, type=Tensor { ty: Float32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["batch_size", "N"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Loaded vocabulary with 8193 tokens, blank_idx=8192
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:52:26][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 128.302799ms
[2026-02-20][22:52:26][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
[2026-02-20][22:52:28][handy_app_lib::managers::transcription][INFO] Transcription completed in 335ms
[2026-02-20][22:52:28][handy_app_lib::managers::transcription][INFO] Transcription result: A last test.
[2026-02-20][22:52:28][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:52:28][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:52:28][handy_app_lib::clipboard][INFO] Using ydotool for key combo
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/ggml-small.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_init_with_params_no_state: backends = 2
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: Vulkan0 total size = 487.01 MB
whisper_model_load: model size = 487.01 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size = 18.87 MB
whisper_init_state: kv cross size = 56.62 MB
whisper_init_state: kv pad size = 4.72 MB
whisper_init_state: compute buffer (conv) = 23.37 MB
whisper_init_state: compute buffer (encode) = 128.01 MB
whisper_init_state: compute buffer (cross) = 6.18 MB
whisper_init_state: compute buffer (decode) = 98.19 MB
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:52:40][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 130.696382ms
[2026-02-20][22:52:40][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: nl (p = 0.010000)
^C⏎
~/A/squashfs-root [SIGINT]> ./AppRun --debug
[2026-02-20][22:53:00][handy_app_lib::managers::history][INFO] Initializing database at "/home/john/.local/share/com.pais.handy/history.db"
[2026-02-20][22:53:00][handy_app_lib::commands][INFO] Enigo initialized successfully after permission grant
[2026-02-20][22:53:00][handy_app_lib::commands][INFO] Shortcuts initialized successfully
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/ggml-small.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:53:10][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 193.402261ms
[2026-02-20][22:53:10][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: Intel(R) UHD Graphics 620 (KBL GT2) (Intel open-source Mesa driver) | uma: 1 | fp16: 1 | warp size: 32
whisper_init_with_params_no_state: backends = 2
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: Vulkan0 total size = 487.01 MB
whisper_model_load: model size = 487.01 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size = 18.87 MB
whisper_init_state: kv cross size = 56.62 MB
whisper_init_state: kv pad size = 4.72 MB
whisper_init_state: compute buffer (conv) = 23.37 MB
whisper_init_state: compute buffer (encode) = 128.01 MB
whisper_init_state: compute buffer (cross) = 6.18 MB
whisper_init_state: compute buffer (decode) = 98.19 MB
whisper_full_with_state: auto-detected language: en (p = 0.970003)
[2026-02-20][22:53:24][handy_app_lib::managers::transcription][INFO] Transcription completed in 11139ms
[2026-02-20][22:53:24][handy_app_lib::managers::transcription][INFO] Transcription result: And last to test.
[2026-02-20][22:53:24][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:53:24][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:53:24][handy_app_lib::clipboard][INFO] Using ydotool for key combo
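To make the pattern in the log above easier to see, here is a minimal Python sketch that pairs each Whisper language-detection line with the transcription result that follows it and with the most recently loaded Whisper model file. The function name `summarize` and the three regular expressions are my own and not part of Handy or whisper.cpp; they only assume the line formats shown in this log, and the sample is a few lines copied from it.

```python
import re

# Patterns for the three Whisper-related line types that appear in the log.
# (Hypothetical helpers; they only assume the formats seen in this issue.)
MODEL_RE = re.compile(r"loading model from '([^']+)'")
LANG_RE = re.compile(r"auto-detected language: (\w+) \(p = ([0-9.]+)\)")
RESULT_RE = re.compile(r"Transcription result: (.*)$")


def summarize(log_text):
    """Return one (model, language, p, result) tuple per Whisper transcription."""
    rows = []
    model = None    # file name of the most recently loaded Whisper model
    pending = None  # (language, p) still waiting for its result line
    for line in log_text.splitlines():
        m = MODEL_RE.search(line)
        if m:
            model = m.group(1).rsplit("/", 1)[-1]
            continue
        m = LANG_RE.search(line)
        if m:
            # A newer detection replaces one that never got a result
            # (for example, when the app was interrupted with Ctrl+C).
            pending = (m.group(1), float(m.group(2)))
            continue
        m = RESULT_RE.search(line)
        if m and pending is not None:
            rows.append((model, pending[0], pending[1], m.group(1).strip()))
            pending = None
    return rows


# A few lines copied from the log above, used as sample input.
sample = "\n".join([
    "whisper_init_from_file_with_params_no_state: loading model from "
    "'/home/john/.local/share/com.pais.handy/models/whisper-medium-q4_1.bin'",
    "whisper_full_with_state: auto-detected language: nl (p = 0.010000)",
    "[2026-02-20][22:47:48][handy_app_lib::managers::transcription][INFO] "
    "Transcription result: prob prob prob. at!!",
])

for row in summarize(sample):
    print(row)
```

Result lines with no preceding language-detection line (such as the Parakeet transcription) are skipped, so the summary covers only the Whisper runs.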
System Information
App Version:
v. 0.7.7
Operating System:
Fedora 43 KDE, KWin
CPU:
Intel i7-8550U
GPU:
NVIDIA GeForce MX150