First impressions of Whisper
Tried out openai/whisper
Whisper is a general-purpose speech recognition model.
Install
Whisper also requires the command-line tool ffmpeg to be installed on your system.
OMG. Installing one package with brew takes so long.
Had some errors trying to run whisper.
Fixed the error by running the following command (change 3.8 to your version of Python)
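Before sitting through a long brew install, it can be worth checking whether ffmpeg is already on the PATH. A minimal sketch using only the Python standard library; the helper name `ffmpeg_available` is made up for illustration:

```python
import shutil


def ffmpeg_available(tool: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on the PATH."""
    # shutil.which returns the full path of the executable, or None if missing.
    return shutil.which(tool) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found; install it with your package manager first")
```

If this reports that ffmpeg is missing, whisper will most likely fail as soon as it tries to load an audio file.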
Run
Command I used for transcribing
Any model bigger than 'base' was way too slow for my old MacBook (2014).
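For reference, here is a rough sketch of what a whisper invocation with an explicit model size looks like, assembled in Python. The file name `memo.m4a` and the helper `build_whisper_command` are made-up placeholders, and this is not necessarily the exact command I ran; `--model` and `--language` are options of the whisper command-line tool.

```python
import shlex


def build_whisper_command(audio_path, model="base", language=None):
    """Build the argument list for the whisper CLI (illustrative helper)."""
    # Smaller models such as "tiny" or "base" are the realistic choice on old hardware.
    cmd = ["whisper", audio_path, "--model", model]
    if language:
        # Optional: tell whisper the spoken language instead of letting it detect it.
        cmd += ["--language", language]
    return cmd


if __name__ == "__main__":
    # "memo.m4a" is a hypothetical file name used only for this example.
    print(shlex.join(build_whisper_command("memo.m4a", model="base")))
    print(shlex.join(build_whisper_command("memo.m4a", model="base", language="ko")))
```

The resulting list can be passed to `subprocess.run` to actually start the transcription.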
Output
The command prints some sample results to the terminal and generates transcript files: _.vtt and _.txt
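The .vtt file is a WebVTT subtitle file, so each piece of text comes with a start and end time. A small sketch of reading those cues back in Python, assuming the simple layout of a timing line followed by caption text; the `parse_vtt` helper and the sample text are made up for illustration and are not real whisper output:

```python
import re

# Matches a cue timing line such as "00:04.000 --> 00:07.500" (hours are optional).
CUE_TIMING = re.compile(
    r"^((?:\d+:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d+:)?\d{2}:\d{2}\.\d{3})"
)


def parse_vtt(text):
    """Return a list of (start, end, caption) tuples from WebVTT text."""
    cues = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        match = CUE_TIMING.match(lines[i].strip())
        if match is None:
            # Skip the header, blank lines, and anything that is not a timing line.
            i += 1
            continue
        i += 1
        caption_lines = []
        # The caption runs until the next blank line or the end of the file.
        while i < len(lines) and lines[i].strip():
            caption_lines.append(lines[i].strip())
            i += 1
        cues.append((match.group(1), match.group(2), " ".join(caption_lines)))
    return cues


# Made-up sample in WebVTT format, for illustration only.
SAMPLE = """WEBVTT

00:00.000 --> 00:04.000
Hello there.

00:04.000 --> 00:07.500
This is a made-up sample.
"""

if __name__ == "__main__":
    for start, end, caption in parse_vtt(SAMPLE):
        print(f"[{start} -> {end}] {caption}")
```

The .txt file needs no parsing at all, since it is just the plain transcript.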
Impressions
Awesome speech recognition for English. It picked up my accented voice reading stuff in English flawlessly.
However, a voice memo of a conversation in Korean between me and my then-4yo daughter wasn't as impressive. There is much more room for improvement on Korean and possibly other non-English languages.
Grateful that these things are available.