[BUG] Whisper Models Fail After Choosing >= Medium #870

@suuuehgi

Bug Description

I noticed that only the small Whisper model works (cf. #261).

When I switch to a larger model such as medium, something breaks irreversibly
(in subsequent runs the language-detection p values are << 1):

whisper_full_with_state: auto-detected language: en (p = 0.998792)
[2026-02-20][22:59:23][handy_app_lib::managers::transcription][INFO] Transcription completed in 48994ms
[2026-02-20][22:59:23][handy_app_lib::managers::transcription][INFO] Transcription result: prob prob prob. at N! N!.

It seems to work up to the language detection (which is still correct here) and breaks afterwards.
From then on the models produce only gibberish, even when I switch back to Whisper small.

ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:45:01][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 126.292029ms
[2026-02-20][22:45:01][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: en (p = 0.577720)
[2026-02-20][22:45:15][handy_app_lib::managers::transcription][INFO] Transcription completed in 11258ms
[2026-02-20][22:45:15][handy_app_lib::managers::transcription][INFO] Transcription result: blah, blah, blah, blah, test.
[2026-02-20][22:45:15][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:45:15][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:45:15][handy_app_lib::clipboard][INFO] Using ydotool for key combo
[2026-02-20][22:45:31][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from encoder-model.int8.onnx...
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=audio_signal, type=Tensor { ty: Float32, shape:[-1, 128, -1], dimension_symbols: SymbolicDimensions(["audio_signal_dynamic_axes_1", "", "audio_signal_dynamic_axes_2"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=length, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from decoder_joint-model.int8.onnx...
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=encoder_outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["encoder_outputs_dynamic_axes_1", "", "encoder_outputs_dynamic_axes_2"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=targets, type=Tensor { ty: Int32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["targets_dynamic_axes_1", "targets_dynamic_axes_2"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=target_length, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_1_dynamic_axes_1", ""]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_2_dynamic_axes_1", ""]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Loading model from nemo128.onnx...
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms, type=Tensor { ty: Float32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["batch_size", "N"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) }
[2026-02-20][22:45:34][transcribe_rs::engines::parakeet::model][INFO] Loaded vocabulary with 8193 tokens, blank_idx=8192
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/ggml-small.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_init_with_params_no_state: backends   = 2
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head  = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 768
whisper_model_load: n_text_head   = 12
whisper_model_load: n_text_layer  = 12
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:  Vulkan0 total size =   487.01 MB
whisper_model_load: model size    =  487.01 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size  =   18.87 MB
whisper_init_state: kv cross size =   56.62 MB
whisper_init_state: kv pad  size  =    4.72 MB
whisper_init_state: compute buffer (conv)   =   23.37 MB
whisper_init_state: compute buffer (encode) =  128.01 MB
whisper_init_state: compute buffer (cross)  =    6.18 MB
whisper_init_state: compute buffer (decode) =   98.19 MB
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:45:41][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 132.348507ms
[2026-02-20][22:45:41][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: en (p = 0.962492)
[2026-02-20][22:45:53][handy_app_lib::managers::transcription][INFO] Transcription completed in 11451ms
[2026-02-20][22:45:53][handy_app_lib::managers::transcription][INFO] Transcription result: This is a test.
[2026-02-20][22:45:53][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:45:53][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:45:53][handy_app_lib::clipboard][INFO] Using ydotool for key combo
[2026-02-20][22:46:09][handy_app_lib::managers::model][INFO] Starting fresh download of model medium from [URL]
[2026-02-20][22:46:47][handy_app_lib::managers::model][INFO] Successfully downloaded model medium to "/home/john/.local/share/com.pais.handy/models/whisper-medium-q4_1.bin"
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/whisper-medium-q4_1.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_init_with_params_no_state: backends   = 2
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head  = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1024
whisper_model_load: n_text_head   = 16
whisper_model_load: n_text_layer  = 24
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 3
whisper_model_load: qntvr         = 2
whisper_model_load: type          = 4 (medium)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:  Vulkan0 total size =   491.23 MB
whisper_model_load: model size    =  491.23 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size  =   50.33 MB
whisper_init_state: kv cross size =  150.99 MB
whisper_init_state: kv pad  size  =    6.29 MB
whisper_init_state: compute buffer (conv)   =   29.51 MB
whisper_init_state: compute buffer (encode) =  170.15 MB
whisper_init_state: compute buffer (cross)  =    7.72 MB
whisper_init_state: compute buffer (decode) =   99.11 MB
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:46:53][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 128.179251ms
[2026-02-20][22:46:53][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:46:58][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 130.732617ms
[2026-02-20][22:46:58][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: nl (p = 0.010000)
[2026-02-20][22:47:48][handy_app_lib::managers::transcription][INFO] Transcription completed in 48429ms
[2026-02-20][22:47:48][handy_app_lib::managers::transcription][INFO] Transcription result: prob prob prob. at!!
[2026-02-20][22:47:48][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:47:48][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:47:48][handy_app_lib::clipboard][INFO] Using ydotool for key combo
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:47:53][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 126.849091ms
[2026-02-20][22:47:53][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: en (p = 0.452633)
[2026-02-20][22:48:44][handy_app_lib::managers::transcription][INFO] Transcription completed in 49879ms
[2026-02-20][22:48:44][handy_app_lib::managers::transcription][INFO] Transcription result: prob!.
[2026-02-20][22:48:44][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:48:44][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:48:44][handy_app_lib::clipboard][INFO] Using ydotool for key combo
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/ggml-small.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_init_with_params_no_state: backends   = 2
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head  = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 768
whisper_model_load: n_text_head   = 12
whisper_model_load: n_text_layer  = 12
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:  Vulkan0 total size =   487.01 MB
whisper_model_load: model size    =  487.01 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size  =   18.87 MB
whisper_init_state: kv cross size =   56.62 MB
whisper_init_state: kv pad  size  =    4.72 MB
whisper_init_state: compute buffer (conv)   =   23.37 MB
whisper_init_state: compute buffer (encode) =  128.01 MB
whisper_init_state: compute buffer (cross)  =    6.18 MB
whisper_init_state: compute buffer (decode) =   98.19 MB
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:48:57][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 129.716723ms
[2026-02-20][22:48:57][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: nl (p = 0.010000)
[2026-02-20][22:49:10][handy_app_lib::managers::transcription][INFO] Transcription completed in 11416ms
[2026-02-20][22:49:10][handy_app_lib::managers::transcription][INFO] Transcription result: Number
[2026-02-20][22:49:10][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:49:10][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:49:10][handy_app_lib::clipboard][INFO] Using ydotool for key combo
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:49:16][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 128.857267ms
[2026-02-20][22:49:16][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: nl (p = 0.010000)
[2026-02-20][22:49:29][handy_app_lib::managers::transcription][INFO] Transcription completed in 11311ms
[2026-02-20][22:49:29][handy_app_lib::managers::transcription][INFO] Transcription result: jam
[2026-02-20][22:49:29][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:49:29][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:49:29][handy_app_lib::clipboard][INFO] Using ydotool for key combo
[2026-02-20][22:49:35][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from encoder-model.int8.onnx...
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=audio_signal, type=Tensor { ty: Float32, shape:[-1, 128, -1], dimension_symbols: SymbolicDimensions(["audio_signal_dynamic_axes_1", "", "audio_signal_dynamic_axes_2"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=length, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from decoder_joint-model.int8.onnx...
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=encoder_outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["encoder_outputs_dynamic_axes_1", "", "encoder_outputs_dynamic_axes_2"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=targets, type=Tensor { ty: Int32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["targets_dynamic_axes_1", "targets_dynamic_axes_2"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=target_length, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_1_dynamic_axes_1", ""]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_2_dynamic_axes_1", ""]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Loading model from nemo128.onnx...
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms, type=Tensor { ty: Float32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["batch_size", "N"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) }
[2026-02-20][22:49:37][transcribe_rs::engines::parakeet::model][INFO] Loaded vocabulary with 8193 tokens, blank_idx=8192
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/ggml-small.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_init_with_params_no_state: backends   = 2
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head  = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 768
whisper_model_load: n_text_head   = 12
whisper_model_load: n_text_layer  = 12
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:  Vulkan0 total size =   487.01 MB
whisper_model_load: model size    =  487.01 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size  =   18.87 MB
whisper_init_state: kv cross size =   56.62 MB
whisper_init_state: kv pad  size  =    4.72 MB
whisper_init_state: compute buffer (conv)   =   23.37 MB
whisper_init_state: compute buffer (encode) =  128.01 MB
whisper_init_state: compute buffer (cross)  =    6.18 MB
whisper_init_state: compute buffer (decode) =   98.19 MB
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:49:47][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 129.218878ms
[2026-02-20][22:49:47][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: nn (p = 0.624495)
[2026-02-20][22:50:51][handy_app_lib::managers::transcription][INFO] Transcription completed in 62476ms
[2026-02-20][22:50:51][handy_app_lib::managers::transcription][INFO] Transcription result: Mi
[2026-02-20][22:50:51][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:50:51][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:50:51][handy_app_lib::clipboard][INFO] Using ydotool for key combo
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:50:59][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 127.533426ms
[2026-02-20][22:50:59][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: en (p = 0.414875)
[2026-02-20][22:52:12][handy_app_lib::managers::transcription][INFO] Transcription completed in 70039ms
[2026-02-20][22:52:12][handy_app_lib::managers::transcription][INFO] Transcription result: !!!!
[2026-02-20][22:52:12][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:52:12][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:52:12][handy_app_lib::clipboard][INFO] Using ydotool for key combo
[2026-02-20][22:52:18][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from encoder-model.int8.onnx...
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=audio_signal, type=Tensor { ty: Float32, shape:[-1, 128, -1], dimension_symbols: SymbolicDimensions(["audio_signal_dynamic_axes_1", "", "audio_signal_dynamic_axes_2"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'encoder-model.int8.onnx' input: name=length, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["length_dynamic_axes_1"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Loading quantized model from decoder_joint-model.int8.onnx...
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=encoder_outputs, type=Tensor { ty: Float32, shape: [-1, 1024, -1], dimension_symbols: SymbolicDimensions(["encoder_outputs_dynamic_axes_1", "", "encoder_outputs_dynamic_axes_2"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=targets, type=Tensor { ty: Int32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["targets_dynamic_axes_1", "targets_dynamic_axes_2"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=target_length, type=Tensor { ty: Int32, shape: [-1], dimension_symbols: SymbolicDimensions(["target_length_dynamic_axes_1"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_1, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_1_dynamic_axes_1", ""]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'decoder_joint-model.int8.onnx' input: name=input_states_2, type=Tensor { ty: Float32, shape: [2, -1, 640], dimension_symbols: SymbolicDimensions(["", "input_states_2_dynamic_axes_1", ""]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Loading model from nemo128.onnx...
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms, type=Tensor { ty: Float32, shape: [-1, -1], dimension_symbols: SymbolicDimensions(["batch_size", "N"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Model 'nemo128.onnx' input: name=waveforms_lens, type=Tensor { ty: Int64, shape: [-1], dimension_symbols: SymbolicDimensions(["batch_size"]) }
[2026-02-20][22:52:20][transcribe_rs::engines::parakeet::model][INFO] Loaded vocabulary with 8193 tokens, blank_idx=8192
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:52:26][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 128.302799ms
[2026-02-20][22:52:26][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
[2026-02-20][22:52:28][handy_app_lib::managers::transcription][INFO] Transcription completed in 335ms
[2026-02-20][22:52:28][handy_app_lib::managers::transcription][INFO] Transcription result: A last test.
[2026-02-20][22:52:28][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:52:28][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:52:28][handy_app_lib::clipboard][INFO] Using ydotool for key combo
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/ggml-small.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_init_with_params_no_state: backends   = 2
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head  = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 768
whisper_model_load: n_text_head   = 12
whisper_model_load: n_text_layer  = 12
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:  Vulkan0 total size =   487.01 MB
whisper_model_load: model size    =  487.01 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size  =   18.87 MB
whisper_init_state: kv cross size =   56.62 MB
whisper_init_state: kv pad  size  =    4.72 MB
whisper_init_state: compute buffer (conv)   =   23.37 MB
whisper_init_state: compute buffer (encode) =  128.01 MB
whisper_init_state: compute buffer (cross)  =    6.18 MB
whisper_init_state: compute buffer (decode) =   98.19 MB
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:52:40][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 130.696382ms
[2026-02-20][22:52:40][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
whisper_full_with_state: auto-detected language: nl (p = 0.010000)
^C⏎ 
~/A/squashfs-root [SIGINT]> ./AppRun --debug
[2026-02-20][22:53:00][handy_app_lib::managers::history][INFO] Initializing database at "/home/john/.local/share/com.pais.handy/history.db"
[2026-02-20][22:53:00][handy_app_lib::commands][INFO] Enigo initialized successfully after permission grant
[2026-02-20][22:53:00][handy_app_lib::commands][INFO] Shortcuts initialized successfully
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/ggml-small.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM pulse
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM jack
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) [error.pcm] Unknown PCM oss
[2026-02-20][22:53:10][handy_app_lib::managers::audio][INFO] Microphone stream initialized in 193.402261ms
[2026-02-20][22:53:10][handy_app_lib::audio_toolkit::audio::recorder][INFO] Using device: Ok("RODE NT-USB")
Sample rate: 16000
Channels: 1
Format: F32
ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: Intel(R) UHD Graphics 620 (KBL GT2) (Intel open-source Mesa driver) | uma: 1 | fp16: 1 | warp size: 32
whisper_init_with_params_no_state: backends   = 2
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head  = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 768
whisper_model_load: n_text_head   = 12
whisper_model_load: n_text_layer  = 12
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:  Vulkan0 total size =   487.01 MB
whisper_model_load: model size    =  487.01 MB
whisper_backend_init_gpu: using Vulkan backend
whisper_init_state: kv self size  =   18.87 MB
whisper_init_state: kv cross size =   56.62 MB
whisper_init_state: kv pad  size  =    4.72 MB
whisper_init_state: compute buffer (conv)   =   23.37 MB
whisper_init_state: compute buffer (encode) =  128.01 MB
whisper_init_state: compute buffer (cross)  =    6.18 MB
whisper_init_state: compute buffer (decode) =   98.19 MB
whisper_full_with_state: auto-detected language: en (p = 0.970003)
[2026-02-20][22:53:24][handy_app_lib::managers::transcription][INFO] Transcription completed in 11139ms
[2026-02-20][22:53:24][handy_app_lib::managers::transcription][INFO] Transcription result: And last to test.
[2026-02-20][22:53:24][handy_app_lib::clipboard][INFO] Using paste method: CtrlShiftV, delay: 60ms
[2026-02-20][22:53:24][handy_app_lib::clipboard][INFO] Using wl-copy for clipboard write on Wayland
[2026-02-20][22:53:24][handy_app_lib::clipboard][INFO] Using ydotool for key combo
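
The pattern in the log above is easier to see when it is condensed to one row per transcription. The following is a minimal sketch, not part of Handy: `summarize` and the regular expressions are hypothetical helpers written against the line shapes visible in this log, and attributing each result to the most recently loaded model is only a heuristic.

```python
import re

# Line shapes taken from the log in this report (assumed stable).
MODEL_RE = re.compile(r"loading model from '([^']+)'")
LANG_RE = re.compile(r"auto-detected language: (\w+) \(p = ([0-9.]+)\)")
DONE_RE = re.compile(r"Transcription completed in (\d+)ms")
RESULT_RE = re.compile(r"Transcription result: (.*)")

SAMPLE = """\
whisper_init_from_file_with_params_no_state: loading model from '/home/john/.local/share/com.pais.handy/models/ggml-small.bin'
whisper_full_with_state: auto-detected language: en (p = 0.962492)
[2026-02-20][22:45:53][handy_app_lib::managers::transcription][INFO] Transcription completed in 11451ms
[2026-02-20][22:45:53][handy_app_lib::managers::transcription][INFO] Transcription result: This is a test.
"""


def summarize(log_text):
    """Return one dict per 'Transcription result' line in the log."""
    rows = []
    model = None              # last model seen loading (heuristic)
    lang = p = ms = None      # reset after every transcription
    for line in log_text.splitlines():
        if "engines::parakeet" in line:
            model = "parakeet"
        elif m := MODEL_RE.search(line):
            model = m.group(1).rsplit("/", 1)[-1]   # keep the file name only
        elif m := LANG_RE.search(line):
            lang, p = m.group(1), float(m.group(2))
        elif m := DONE_RE.search(line):
            ms = int(m.group(1))
        elif m := RESULT_RE.search(line):
            rows.append({"model": model, "lang": lang, "p": p,
                         "ms": ms, "text": m.group(1)})
            lang = p = ms = None
    return rows


for row in summarize(SAMPLE):
    print(row)
```

Feeding the complete log to `summarize` instead of `SAMPLE` gives the sequence of models, detected languages, p values, durations, and outputs in order, which makes it easier to compare the runs before and after the switch to medium.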

System Information

App Version:

v. 0.7.7

Operating System:

Fedora 43 KDE, KWin

CPU:

Intel i7-8550U

GPU:

NVIDIA GeForce MX150

Labels:

bug