ElevenLabs

Scribe v1

Scribe v1 is ElevenLabs' original speech-to-text model, designed to convert spoken audio into written transcripts. Built as the foundation of ElevenLabs' transcription offering, it enables developers and creators to automatically transcribe audio and video content through the ElevenLabs API. The model supports transcription across multiple languages, making it usable in multilingual workflows and automation pipelines. Scribe v1 has been deployed in use cases ranging from voice note capture to content production tooling. It has since been succeeded by Scribe v2, which adds features such as support for 90+ languages, speaker diarization for up to 32 speakers, word-level timestamps, and entity detection. Developers starting new projects are directed by ElevenLabs to use Scribe v2, while Scribe v1 remains available for existing integrations.

Model Overview

| Field | Description | Value |
|---|---|---|
| Provider | The entity that provides this model. | ElevenLabs |
| Input Context Window | The number of tokens supported by the input context window. | N/A |
| Maximum Output Tokens | The number of tokens that can be generated by the model in a single request. | N/A |
| Open Source | Whether the model's code is available for public use. | No |
| Release Date | When the model was first released. | Unknown |
| Knowledge Cut-off Date | When the model's knowledge was last updated. | Unknown |
| API Providers | The providers that offer this model. This is not an exhaustive list. | ElevenLabs |
| Modalities | Types of data this model can process. | Text, Video, Audio |

Capabilities

What Scribe v1 supports

Audio Transcription

Converts spoken audio from audio and video files into written text transcripts. Accessible via the ElevenLabs API for use in automated pipelines.

Multilingual Support

Transcribes speech across a range of languages, enabling use in multilingual content workflows.

API Access

Available through the ElevenLabs API, allowing integration into developer workflows, automation pipelines, and third-party applications.

Transcript Output

Returns transcription results as structured text output suitable for downstream processing, storage, or display.
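
As a rough illustration of downstream processing, the sketch below groups a transcript's words into speaker turns. The payload shape and the field names (`text`, `words`, `speaker_id`, `start`, `end`) are assumptions made for this example, not a confirmed response schema, and the sample data is invented; it also assumes timing and speaker information were requested.

```python
from itertools import groupby

# Hypothetical response payload. The field names ("text", "words",
# "speaker_id", "start", "end") are assumptions for illustration only.
SAMPLE_RESPONSE = {
    "text": "Hello there. Hi, how are you?",
    "words": [
        {"text": "Hello", "start": 0.0, "end": 0.4, "speaker_id": "speaker_0"},
        {"text": "there.", "start": 0.5, "end": 0.9, "speaker_id": "speaker_0"},
        {"text": "Hi,", "start": 1.2, "end": 1.4, "speaker_id": "speaker_1"},
        {"text": "how", "start": 1.5, "end": 1.7, "speaker_id": "speaker_1"},
        {"text": "are", "start": 1.7, "end": 1.9, "speaker_id": "speaker_1"},
        {"text": "you?", "start": 1.9, "end": 2.3, "speaker_id": "speaker_1"},
    ],
}


def speaker_turns(response):
    """Group consecutive words by speaker into (speaker, start, end, text) tuples."""
    turns = []
    words = response.get("words", [])
    for speaker, group in groupby(words, key=lambda w: w.get("speaker_id")):
        chunk = list(group)
        text = " ".join(w["text"] for w in chunk)
        turns.append((speaker, chunk[0]["start"], chunk[-1]["end"], text))
    return turns


# Print one line per speaker turn, with the turn's time span.
for speaker, start, end, text in speaker_turns(SAMPLE_RESPONSE):
    print(f"[{start:.1f}s-{end:.1f}s] {speaker}: {text}")
```

Grouping only consecutive words keeps the turn order intact, so a speaker who talks twice produces two separate turns rather than one merged block.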

API Access & Providers

Providers through which this model is available.

ElevenLabs

Configuration & Parameters

The configurable options currently documented for this model.

Include Speakers (select)

Choose whether to include timing and speaker information in the transcription.

Options: Yes, No
Default: No
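
One way a client might translate the Include Speakers option into request fields is sketched below. The mapping is hypothetical: the function name `transcription_fields` and the field names `model_id`, `diarize`, and `timestamps_granularity` are assumptions to check against the ElevenLabs API reference, not something this page documents.

```python
def transcription_fields(include_speakers: bool = False) -> dict:
    """Build form fields for a transcription request.

    Hypothetical mapping: the field names below are assumptions,
    not names confirmed by this catalog page.
    """
    fields = {"model_id": "scribe_v1"}
    if include_speakers:
        # Assumed flags for speaker labels and per-word timing.
        fields["diarize"] = "true"
        fields["timestamps_granularity"] = "word"
    return fields


print(transcription_fields())       # default: Include Speakers = No
print(transcription_fields(True))   # Include Speakers = Yes
```

Keeping the default at `False` mirrors the documented default of No, so callers only receive timing and speaker data when they ask for it.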

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Include Speakers

Community discussion

What people think about Scribe v1

Scribe v1 discussions are most active in r/ElevenLabs, r/LocalLLaMA, and r/macapps. The top Reddit threads are mostly benchmark and model-comparison discussions.

The strongest match in this snapshot has 79 upvotes and 25 comments.

I shipped **Rokid-Scribe v1.1.1**, a voice note workflow for Rokid glasses.

The main reason I built it is simple: I didn’t really like the AI transcription flow in the **Hi Rokid** app. In my testing, it felt a bit slow for long audio recordings (10+ minutes), the transcripts were sometimes not that accurate, and the summary function didn't work properly. So instead of being locked into one built-in transcription path, I went for a **custom multi-provider approach**.

With Rokid-Scribe, the flow is:

* Record on the glasses
* Import to the phone over local transport (Wi-Fi or Bluetooth)
* Transcribe on the phone with the provider you want
* Keep everything local on the phone
* Export to `.txt` or `.pdf`, or copy it to your clipboard

Current supported providers:

* **ElevenLabs**
* **AssemblyAI**
* **Speechmatics**
* **Deepgram**
* **Groq**

A nice side effect of the multi-provider setup is that you can pick what matters most to you:

* accuracy
* speed
* language support
* diarization / multi-speaker support
* free tier / cost

# Free tier snapshot

Checked on **April 13, 2026**. These can change, so always verify the official pricing pages.

**ElevenLabs**: free plan with **10k credits/month**, and STT is included. Their pricing page currently shows that this is roughly **13h38 of Scribe v2 transcription** on the free plan.

[https://elevenlabs.io/pricing/api?price.section=speech_to_text](https://elevenlabs.io/pricing/api?price.section=speech_to_text)

**AssemblyAI**: free tier currently advertised as **up to 333 hours** of transcription.

[https://www.assemblyai.com/pricing](https://www.assemblyai.com/pricing)

**Speechmatics**: free tier currently includes **480 minutes/month** of speech-to-text.

[https://www.speechmatics.com/pricing](https://www.speechmatics.com/pricing)

**Deepgram**: free signup currently includes **$200 in credits**, no credit card required.

[https://deepgram.com/pricing](https://deepgram.com/pricing)

**Groq**: free plan exists, but it’s more **rate-limit based** than a simple monthly credit bucket. For STT, the docs currently mention **25 MB max upload on free tier**.

[https://console.groq.com/docs/speech-to-text](https://console.groq.com/docs/speech-to-text) [https://console.groq.com/docs/rate-limits](https://console.groq.com/docs/rate-limits)

If you want something more flexible than the default transcription flow, this app is made for you.

I’d love feedback on the app, and if you want more providers added, let me know which ones work best for you!

Repo: [https://github.com/Anezium/Rokid-Scribe](https://github.com/Anezium/Rokid-Scribe)

Release: [https://github.com/Anezium/Rokid-Scribe/releases/tag/v1.1.1](https://github.com/Anezium/Rokid-Scribe/releases/tag/v1.1.1)

Hey everyone — big day for SayScribe.

The Mac app is officially live on the Mac App Store. Same app you know from iPhone and iPad, now native on macOS with full iCloud sync between all your devices. 

Universal Purchase — buy once, use everywhere on your Apple devices.

* Mac App Store/iPhone/iPad: [https://apps.apple.com/app/id6759438198](https://apps.apple.com/app/id6759438198)

Huge thanks to everyone who tested the Mac build. Feedback is always welcome here: post bugs, feature requests, anything.

r/ElevenLabs 1 upvote January 16, 2026
Is Scribe V1 having a stroke right now?

A test audio segment that I've used a bunch, with only minor transcription errors until now, is suddenly returning total gibberish! Anyone else having issues? Language is Urdu to English.

r/ElevenLabs 1 upvote 5 comments November 12, 2025
No logprobs on Scribe v1

Hello

When I run transcriptions using Scribe v1, it seems like each token's logprob defaults to 0.0. I never get any other value, even for hallucinated transcriptions of low-quality audio. My aim is to use these logprobs to compute some kind of confidence level.

Are logprobs not available for Scribe v1 or am I doing something wrong?
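
The kind of confidence score described in this thread could be sketched as the geometric mean of per-word probabilities. This is a minimal sketch, assuming each word entry carries a `logprob` field holding a natural-log probability; that field name, both function names, and the sample data are assumptions for illustration. The second helper flags the symptom reported above, where every logprob is exactly 0.0.

```python
import math


def transcript_confidence(words):
    """Return the geometric-mean word probability, or None if no logprobs exist.

    Assumes each word dict may carry a natural-log "logprob" (hypothetical field).
    """
    logprobs = [w["logprob"] for w in words if "logprob" in w]
    if not logprobs:
        return None
    # exp(mean(log p)) equals the geometric mean of the probabilities.
    return math.exp(sum(logprobs) / len(logprobs))


def logprobs_look_unpopulated(words):
    """True when logprobs are present but every one is exactly 0.0."""
    logprobs = [w["logprob"] for w in words if "logprob" in w]
    return bool(logprobs) and all(lp == 0.0 for lp in logprobs)


# Invented sample data: two words, each with probability 0.5.
sample = [
    {"text": "hello", "logprob": math.log(0.5)},
    {"text": "world", "logprob": math.log(0.5)},
]
print(transcript_confidence(sample))
print(logprobs_look_unpopulated([{"logprob": 0.0}, {"logprob": 0.0}]))
```

A logprob of 0.0 corresponds to a probability of 1.0, so a transcript whose values are all zero would score as perfectly confident, which is why checking for that case separately is useful before trusting the score.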

FAQ

Common questions about Scribe v1

What is Scribe v1 used for?

Scribe v1 is used to transcribe spoken audio from audio and video files into written text. It has been used in workflows such as voice note capture, content production, and automated transcription pipelines via the ElevenLabs API.

Does Scribe v1 support multiple languages?

Yes, Scribe v1 supports transcription across multiple languages, making it suitable for multilingual workflows. However, its successor Scribe v2 expands this to 90+ languages.

What is the context window for Scribe v1?

No context window size is specified in the available metadata for Scribe v1, as it is a speech-to-text transcription model rather than a language model.

Has Scribe v1 been replaced by a newer model?

Yes. ElevenLabs has released Scribe v2, which adds speaker diarization for up to 32 speakers, support for 90+ languages, word-level timestamps, keyterm prompting, and entity detection. ElevenLabs recommends Scribe v2 for new applications.

How is Scribe v1 accessed?

Scribe v1 is accessible via the ElevenLabs API. It can be integrated into developer workflows and automation pipelines for audio and video transcription tasks.
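
For a sense of what API access might look like, the sketch below builds (but does not send) a multipart upload request using only the Python standard library. The endpoint URL, the `xi-api-key` header, the `model_id` and `file` form fields, and the `scribe_v1` identifier are assumptions that should be verified against the ElevenLabs API reference; `build_request` is a hypothetical helper name.

```python
import uuid
import urllib.request

# Assumed endpoint; verify against the official API reference.
API_URL = "https://api.elevenlabs.io/v1/speech-to-text"


def build_request(audio: bytes, filename: str, api_key: str,
                  model_id: str = "scribe_v1") -> urllib.request.Request:
    """Build a multipart/form-data POST carrying the model ID and audio bytes."""
    boundary = uuid.uuid4().hex
    # Text portion of the body: the model_id field, then the file part headers.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model_id"\r\n\r\n'
        f"{model_id}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    # Closing boundary that terminates the multipart body.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=head + audio + tail,
        method="POST",
        headers={
            "xi-api-key": api_key,  # assumed authentication header
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder bytes and key; no network call is made here.
request = build_request(b"placeholder-audio-bytes", "note.wav", "YOUR_API_KEY")
print(request.get_method(), request.full_url, len(request.data), "bytes")
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` with a real key and real audio, then decoding the JSON body of the response. Building the request separately from sending it makes the payload easy to inspect and test offline.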
