Speech Transcription
Converts spoken audio into written text, supporting a wide range of languages, accents, and audio quality levels.
Whisper is a general-purpose speech recognition model developed by OpenAI and made available via the OpenAI API under the model ID whisper-1. It was trained on a large dataset of diverse audio, enabling it to handle a wide range of accents, background noise conditions, and technical vocabulary. What distinguishes Whisper is its multitask design: it can perform not only speech-to-text transcription but also speech translation into English and automatic language identification within a single model.

Whisper is well suited for developers building transcription pipelines, subtitle generation tools, voice interfaces, or any application that requires converting spoken audio into structured text. It supports multilingual input, making it useful for global applications where audio may arrive in different languages. The model accepts common audio formats and returns transcriptions or translations as plain text or with optional timestamps.
Converts spoken audio into written text, supporting a wide range of languages, accents, and audio quality levels.
Translates spoken audio from supported non-English languages directly into English text in a single pass.
Automatically detects the language spoken in an audio file without requiring the caller to specify it in advance.
Optionally returns word- or segment-level timestamps alongside transcribed text, useful for subtitle and caption generation.
Accepts multiple common audio formats including mp3, mp4, mpeg, mpga, m4a, wav, and webm via the API.
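For the subtitle and caption use case mentioned above, here is a minimal sketch that turns segment-level timestamps into SubRip (SRT) text. It assumes each segment is a dict with `start` and `end` in seconds and a `text` string. The function names and the sample segments are hypothetical and are not taken from the source page.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (assumed dicts with start, end, text) as numbered SRT cues."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)


# Made-up segments, for illustration only.
example_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello there."},
    {"start": 2.5, "end": 61.04, "text": " Welcome back."},
]

if __name__ == "__main__":
    print(segments_to_srt(example_segments))
```

If the API's own subtitle response formats are available to you, requesting one directly may be simpler. A local formatter like this is mainly useful when segments are post-processed (merged, filtered, re-timed) before export.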
Whisper discussions are most active in r/pics, r/aww, r/SonicTheHedgehog. The strongest match in this snapshot has 96547 upvotes and 1854 comments.
So I was there at the start of whisper when it was a pure shower thought app, reconnected during the dark days where it was a scam and hookup app, and stayed until the end where it was a lifeless husk of an app that only worked intermittently.
They were fun days, talked to a lot of fun, fucked up and some disturbed people, and had a lot of fun talking to Nigerian scammers on the phone.
I migrated to some of the replacements and I've had some fun. I'm not going to badmouth them, they're all doing a decent job at recreating the experience under very tricky circumstances, it's not easy to monetise AND innovate AND keep everyone safe, users moan a lot and expect the moon on a stick without parting with a penny.
But, let's be brutally honest, Whisper at the end were the dregs of the community that made it special. These new apps are competing for the few of those dregs that didn't give up the dream. I'm sure there are a few good people on there, but for every good one, there are a hundred posting the same horny shit 20 times a day, not authentically engaging with the community, just there to DM any woman with a pulse or post ragebait.
I've noticed that at the end of 2025 some of the good folks have stopped posting altogether, as I've looked over the past few days there have been no interesting posts to engage with. The reanimated corpse is bereft of life, however much it wants to pretend that more American politic ragebait shows genuine engaged users.
So I just wanted to say goodbye to the idea of Whisper. You were lightning in a bottle and built a fantastic community of freaks. You are sorely missed.
For those of you clinging to the dream, I wish you well. The replacement apps themselves aren't that terrible when you consider the things I said above. I'm sure with sustained effort and some new blood, the anonymous shower thought app can make a comeback for a new generation.
Thank you for reading my rant and my thoughts. To anyone out there that recalls talking to plonker, thank you for the good times and I wish you well x
War perk -> Grim Retribution, Colossal Foe and Malignant Invasion
Every time you gain Whisper progress, you have a higher chance of getting ambushed. So if you complete a Whisper objective, even if it's worth only 1 point, you can get ambushed at a high rate.
These ambushes drop the rewards of the Whisper cache. What's inside the Whisper cache now is Gift of the Tree :).
So you can accelerate getting them.
Saw an ad for this on FB. Has anyone checked it out yet? I’m an audio book only “reader”. Do I really need another subscription account? Worth it?
The OpenAI API enforces a 25 MB file size limit per audio file submitted to the Whisper endpoint.
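A pre-upload check against this limit can be sketched as follows. The sketch assumes "25 MB" means 25 * 1024 * 1024 bytes, which the source does not specify, and all function names are hypothetical. `pieces_needed` only gives a lower bound on how many parts an oversized file would need. Actual splitting should be done on time boundaries with an audio tool, not by slicing bytes.

```python
import os

# Assumption: the limit is binary megabytes; adjust if the API counts decimal MB.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def fits_upload_limit(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if a non-empty file of this size is within the upload limit."""
    return 0 < size_bytes <= limit


def pieces_needed(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> int:
    """Lower bound on the number of parts so that each part is at most `limit` bytes."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return -(-size_bytes // limit)  # ceiling division


def file_fits(path: str) -> bool:
    """Check a file on disk against the limit before attempting an upload."""
    return fits_upload_limit(os.path.getsize(path))
```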
Whisper is an audio model, not a text model, so it does not have a token-based input context window in the sense used for text models. Internally, audio is processed in 30-second segments.
Whisper supports transcription in dozens of languages. According to OpenAI, the underlying model was trained on audio in 98 languages, and the documentation lists as supported only those transcribed with a word error rate below 50%.
Whisper's translation capability converts spoken audio in supported non-English languages into English text. Translation into languages other than English is not supported by the model.
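The choice between transcription (text in the spoken language) and translation (English text) can be sketched as below, assuming the official `openai` Python package (version 1.x) and an `OPENAI_API_KEY` environment variable. `plan_request` and `run_request` are hypothetical helper names. The extension list mirrors the formats named on this page, and the response-format choice is one reasonable default, not the only option.

```python
# Extensions taken from the format list on this page.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def plan_request(filename: str, to_english: bool = False, timestamps: bool = False) -> dict:
    """Validate the file extension and pick the endpoint and parameters."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {filename!r}")
    return {
        # Translation always targets English; transcription keeps the source language.
        "endpoint": "translations" if to_english else "transcriptions",
        "params": {
            "model": "whisper-1",
            # verbose_json carries segment timestamps; text returns a plain string.
            "response_format": "verbose_json" if timestamps else "text",
        },
    }


def run_request(path: str, to_english: bool = False, timestamps: bool = False):
    """Send the audio file to the API. Requires network access and an API key."""
    from openai import OpenAI  # imported lazily so plan_request works without the SDK

    plan = plan_request(path, to_english, timestamps)
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resource = getattr(client.audio, plan["endpoint"])
    with open(path, "rb") as audio_file:
        return resource.create(file=audio_file, **plan["params"])
```

For example, `run_request("interview.m4a", to_english=True)` would request an English translation of a hypothetical file named `interview.m4a`. Leaving `to_english` at its default requests a transcript in the spoken language instead.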
Whisper is billed per minute of audio processed. Pricing details are published on OpenAI's pricing page and may change over time.
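Per-minute billing makes cost estimation simple arithmetic. The sketch below assumes cost is strictly proportional to audio duration with no rounding or minimum charge, which the source does not confirm. The rate in the example is a placeholder and not a published price. `estimate_cost` is a hypothetical helper name.

```python
def estimate_cost(duration_seconds: float, rate_per_minute: float) -> float:
    """Estimate cost as (duration in minutes) * (rate per minute)."""
    if duration_seconds < 0 or rate_per_minute < 0:
        raise ValueError("duration and rate must be non-negative")
    return duration_seconds / 60.0 * rate_per_minute


if __name__ == "__main__":
    # Placeholder rate for illustration; check the provider's pricing page.
    hypothetical_rate = 0.01
    ninety_minutes = 90 * 60
    print(f"Estimated cost: {estimate_cost(ninety_minutes, hypothetical_rate):.2f}")
```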