Whisper is definitely nice, but it's a bit too slow.
Having subtitles and transcription for everything is great - but NeMo Parakeet (pretty much NVIDIA's take on Whisper) completely changed how I interact with the computer.
It enables dictation that actually works and it's as fast as you can think.
I also have a set of scripts which just wait for voice commands and do things.
I can pipe the results to an LLM, run commands, synthesize a spoken reply with F5-TTS, and it's like having a local Jarvis.
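To give a rough idea of the "wait for voice commands and do things" part, here is a minimal sketch of the routing step only. It assumes the speech-to-text step has already produced a transcript string; the phrase table, the `handle` function, and the `ask_llm` callback are all made-up names for illustration, not part of NeMo, Parakeet, or F5-TTS.

```python
import subprocess
from typing import Callable, Dict, List, Optional

# Hypothetical phrase -> command table; these entries are placeholders.
COMMANDS: Dict[str, List[str]] = {
    "open terminal": ["xterm"],
    "lock screen": ["loginctl", "lock-session"],
}


def normalize(transcript: str) -> str:
    """Lowercase and drop punctuation so 'Open terminal.' matches 'open terminal'."""
    kept = "".join(ch for ch in transcript.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def route(transcript: str, commands: Dict[str, List[str]] = COMMANDS) -> Optional[List[str]]:
    """Return the argv for a known phrase, or None if nothing matches."""
    return commands.get(normalize(transcript))


def handle(
    transcript: str,
    run: Callable = subprocess.run,
    ask_llm: Optional[Callable[[str], object]] = None,
    commands: Dict[str, List[str]] = COMMANDS,
) -> str:
    """Run a matching command, otherwise hand the raw transcript to the LLM callback."""
    argv = route(transcript, commands)
    if argv is not None:
        run(argv, check=False)
        return "command"
    if ask_llm is not None:
        ask_llm(transcript)
        return "llm"
    return "ignored"
```

Exact-phrase matching is the simplest thing that could work; anything that doesn't match falls through to the LLM callback, whose answer could then be spoken back by the TTS step.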
Yeah, mind sharing any of the scripts? I looked at the docs briefly, and it looks like we need to install ALL of NeMo to get access to Parakeet? Seems ultra heavy.
The main limitation is that it's English only.