Automatically Transcribing Audio to Text on Your Computer with Whisper Transcription

Hypercomputing: «Good AI prevents bad operators from using computers to conduct useless tasks at high speeds to many degrees of vacuous precision.»
Daoming Sochua, Scientific Morality, Vol VIII in «Civilization: Beyond Earth»

Sometimes there are nuggets of gold in podcasts/vidcasts. Just quotations that are worth writing down. If it happens while listening on a smartphone, a simple screenshot (on iOS: volume up + on/off) will capture the time code. But transcribing them is … annoying.

Well, there’s an app for that.

You’ll find «Whisper Transcription» in the app store. The basic version is free. And even in the free version, voice-to-text is impressive. If you pay for the app (one time or subscription), you get models with higher degrees of precision.

Best of all, the whole thing runs on your computer, so there is no need to transmit potentially private information to other people’s servers. (Downside: It is … computationally intensive …)

The paid version offers batch transcription: just throw a bunch of audio clips at it.

I’m still trying out the app, but so far, damn. I tested it with a podcast episode («You Must Be Some Kind of Therapist» 85. Mass Delusional Psychosis – Dr. Mark McDonald on How Fear is Destroying America), and the results were impressive. I found only one real mistake: it did not recognize the name «Haidt».

Then I just threw some movie clips at it.

For example, this cut from «Vikings» (TV series, there is a beautiful scene in which an old viking wants to go on a last raid to die in battle):

[footsteps] What is your name? Tostig Lord Ragnar. Do you swear allegiance and fealty to me and to my family from this day forth? It won’t be so long then. By my sacred rings, I swear. But I also have a favor to ask, Lord. What is this favor? That the next time you go raiding, you take me with you. [laughter] I do not wish to insult you, but the truth is you are… That I am too old. [laughs] Yes. I am old, but I have been a warrior all my life. Many years I sailed with Lord Harasson and fought battles against the Eastlanders. And I watched all the companions of my youth die. And though I fought with them in the shield wall, never once was I touched by a blade. All the friends and companions of my youth are dead and feasting and drinking with the acer in the halls of the gods. Well, I… I am forsaken. Bereft. Which is why I beg you, Lord, gift me the chance to die with honor in battle and join my friends in Valhalla. This summer, we shall have more ships to go west, for that is our future. When we return to England, let’s take him with us. All in favor? Aye! Aye! Aye! [music] ♪♪♪♪

Wait, it recognizes laughter?!? The only mistakes I found were «acer» instead of «Aesir» and «Well» instead of «While», but that can be forgiven (the names might be wrong too; I could check the subtitles, but nah). Of course, speaker changes are not indicated, but after a little manual reformatting you get a usable quotation:

«What is your name?»
«Tostig Lord Ragnar.»
«Do you swear allegiance and fealty to me and to my family from this day forth?»
«It won’t be so long then.»
«By my sacred rings, I swear. But I also have a favor to ask, Lord.»
«What is this favor?»
«That the next time you go raiding, you take me with you.»
[laughter]
«I do not wish to insult you, but the truth is you are…»
«That I am too old. [laughs] Yes. I am old, but I have been a warrior all my life. Many years I sailed with Lord Harasson and fought battles against the Eastlanders. And I watched all the companions of my youth die. And though I fought with them in the shield wall, never once was I touched by a blade. All the friends and companions of my youth are dead and feasting and drinking with the Aesir in the halls of the gods. While I… I am forsaken. Bereft. Which is why I beg you, Lord, gift me the chance to die with honor in battle and join my friends in Valhalla.»
«This summer, we shall have more ships to go west, for that is our future. When we return to England, let’s take him with us. All in favor?»
«Aye! Aye! Aye!»
«Vikings»
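
If you end up doing this a lot, the first pass of that manual reformatting can probably be scripted. Here is a minimal Python sketch. It has nothing to do with the app itself; it only assumes that you have copied the raw transcript into a string, and the function name `split_transcript` is made up for this example. It puts every sentence and every bracketed sound annotation on its own line. It cannot know who is speaking, so merging lines back into speaker turns (and fixing things like «acer») is still up to you.

```python
import re

# Bracketed sound annotations such as [laughter] or [MUSIC PLAYING].
TAG = re.compile(r"\[[^\]]+\]")


def split_transcript(text):
    """Return a list of lines: one per sentence or sound annotation."""
    lines = []
    # The capture group keeps each bracketed tag as its own chunk.
    for chunk in re.split(r"(\[[^\]]+\])", text):
        chunk = chunk.strip()
        if not chunk:
            continue
        if TAG.fullmatch(chunk):
            lines.append(chunk)
        else:
            # Break after ., ?, ! or … when followed by whitespace.
            sentences = re.split(r"(?<=[.?!…])\s+", chunk)
            lines.extend(s for s in sentences if s)
    return lines


if __name__ == "__main__":
    raw = "[footsteps] What is your name? Tostig Lord Ragnar. [laughter] I am old."
    print("\n".join(split_transcript(raw)))
```

Note that this also breaks after an ellipsis and pulls inline annotations like [laughs] onto separate lines, so a trailing-off sentence gets split in two. For a quotation of this length, joining a few lines by hand is still faster than retyping everything.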

Of course, sometimes there is not that much to transcribe, and the app still does an impressive job (second part: the Viking gets his wish and dies honorably, with «Valhalla» as his last word):

[MUSIC PLAYING] [GRUNTING] [YELLING] [GRUNTING] [GROANING] [GROANING] [GROANING] [GROANING] [MUSIC PLAYING] [GROANING]

Okay, it missed the soft «Valhalla» when the old Viking dies, but still … impressive. And hey, it even seems to recognize when a person is quoting something (using a famous Ronald Reagan clip):

The nine most terrifying words in the English language are, “I’m from the government and I’m here to help.”

It set the quotation marks all by itself.

In a more serious context (well, besides dying honorably in battle), even if you have to listen to the recording to insert the line breaks and speaker changes, the app saves a tremendous amount of typing.

Well worth trying out.