Transcribe YouTube and audio,
free and without uploading anything
Paste a YouTube link or drop an audio file. Transcription runs in your browser and your audio never leaves your device.
Why OpenTranscript
Your audio stays on your device, full stop.
Your audio goes nowhere
Whisper runs inside your browser. There's no middleman server, no upload, and nothing stored in any database.
YouTube: paste the link and you're done
We pull captions directly from YouTube. You don't need to install any extension or download the video.
Actually free, no tricks
Your machine does the work. It costs us nothing to serve you, so we have no reason to charge you. There's no account, no minute limits, and no "7-day free trial" to worry about.
Adapts to your hardware
We check whether your browser has a compatible GPU and how much RAM is available. Powerful GPU = larger, more accurate model. Old laptop = lightweight model that is faster and still gets the job done.
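For the curious, here is a minimal sketch of how a page could run that hardware check. It is an illustration under assumptions, not OpenTranscript's actual code: `DeviceCapabilities` and `detectCapabilities` are hypothetical names, and the `env` parameter exists only so the function can be tried with a stub outside a browser. `navigator.gpu.requestAdapter()` and `navigator.deviceMemory` are real browser APIs, but `deviceMemory` is only exposed by Chromium-based browsers and is capped at 8.

```typescript
// Hypothetical shape for what the page learns about the device.
interface DeviceCapabilities {
  hasWebGpu: boolean;      // true if a WebGPU adapter could be obtained
  memoryGb: number | null; // approximate RAM in GB, null if the browser hides it
}

// `env` defaults to the global object; pass a stub to experiment outside a browser.
async function detectCapabilities(env: any = globalThis): Promise<DeviceCapabilities> {
  const nav = env.navigator;

  let hasWebGpu = false;
  if (nav?.gpu) {
    try {
      // requestAdapter() resolves to null when no suitable GPU is available.
      hasWebGpu = (await nav.gpu.requestAdapter()) !== null;
    } catch {
      hasWebGpu = false;
    }
  }

  // deviceMemory is a coarse, rounded value (for example 2, 4, or 8).
  const memoryGb = typeof nav?.deviceMemory === "number" ? nav.deviceMemory : null;

  return { hasWebGpu, memoryGb };
}
```

When the browser does not report its memory, `memoryGb` comes back as `null`, so the caller has to decide on a conservative default.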
What you can use it for
Not just YouTube. Any audio you need to turn into text.
Transcribe podcasts
Turn your episodes into text to write show notes, articles, or find that one clip you can never locate.
Transcribe meetings
Record the meeting on your phone, drag the audio here, and a few minutes later you have a full written record. No one wastes time taking notes.
Transcribe lectures and talks
For students: record the lecture, transcribe it, and review in writing. Way better than trying to scribble everything down by hand.
Get text from YouTube videos
Need the text of a video to quote, summarize, or translate? Paste the link, copy the result. Two clicks.
Transcribe interviews
Journalists, researchers, UX researchers: transcribe your interviews without uploading recordings to third-party services.
Accessibility
Convert audio content into text for people with hearing impairments or for anyone who simply prefers to read.
How it works
Three steps. No signup required, no waiting.
Paste the link or drop your audio
Paste a YouTube link or drag in an mp3, wav, or m4a file. Your file never leaves the browser.
We process the text
For YouTube we grab captions directly. For audio files, Whisper transcribes on your device using your CPU or GPU.
Copy or download
Text ready to paste wherever you need. Download as .txt or .md with metadata included.
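The first decision in the steps above is whether the input is a YouTube link (captions route) or an audio file (Whisper route). A rough sketch of that routing follows. The names `TranscriptSource`, `AUDIO_EXTENSIONS`, and `classifyInput` are made up for illustration, the extension list is taken from the formats mentioned on this page, and only the two most common YouTube URL shapes are handled.

```typescript
type TranscriptSource =
  | { kind: "youtube"; videoId: string }
  | { kind: "audio"; extension: string }
  | { kind: "unsupported" };

// Audio formats named on this page; whether the app uses exactly this set is an assumption.
const AUDIO_EXTENSIONS = new Set(["mp3", "wav", "m4a", "ogg", "webm"]);

function classifyInput(input: string): TranscriptSource {
  const trimmed = input.trim();

  // Try to read the input as a web link first.
  let url: URL | null = null;
  try {
    url = new URL(trimmed);
  } catch {
    url = null; // Not a URL: treat it as a file name below.
  }

  if (url !== null && (url.protocol === "http:" || url.protocol === "https:")) {
    const host = url.hostname.replace(/^(www\.|m\.)/, "");
    // Short links carry the id in the path, watch pages in the `v` parameter.
    const id =
      host === "youtu.be" ? url.pathname.split("/")[1] :
      host === "youtube.com" ? url.searchParams.get("v") :
      null;
    return id ? { kind: "youtube", videoId: id } : { kind: "unsupported" };
  }

  // Otherwise decide by file extension, ignoring case.
  const dot = trimmed.lastIndexOf(".");
  const extension = dot >= 0 ? trimmed.slice(dot + 1).toLowerCase() : "";
  return AUDIO_EXTENSIONS.has(extension)
    ? { kind: "audio", extension }
    : { kind: "unsupported" };
}
```

In a real page the file branch would receive a dropped `File` object rather than a name, and a production check would also look at the MIME type instead of trusting the extension alone.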
OpenTranscript vs. other services
The main difference: your audio stays on your device.
| | OpenTranscript | Typical services |
|---|---|---|
| Cost | Free, always | $0.006 – $0.05 / minute |
| Privacy | Audio never leaves your device | Your audio is uploaded to their servers |
| Sign-up | Not required | Mandatory |
| Minute limit | No limit | Limited on the free plan |
| Speed | Depends on your hardware | Dedicated GPU servers |
| Maximum accuracy | whisper-small (very good) | whisper-large (excellent) |
Compare Whisper models
Bigger = more accurate, but heavier. We automatically pick the best one for your device.
| Model | Size | Speed | Accuracy | Device |
|---|---|---|---|---|
| whisper-tiny | 75 MB | Very fast | Good | CPU |
| whisper-base | 145 MB | Fast | Very good | GPU / CPU |
| whisper-small | 480 MB | Moderate | Excellent | GPU |
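To give an idea of what "automatically pick the best one" can mean in practice, here is a small sketch that walks the three tiers from largest to smallest and returns the first one the device can handle. The sizes and GPU requirements come from the comparison above. The RAM thresholds (`minMemoryGb`), the 4 GB fallback for unknown memory, and the names `ModelTier`, `TIERS`, and `pickModel` are assumptions for illustration, not the rules the site actually applies.

```typescript
type ModelName = "whisper-tiny" | "whisper-base" | "whisper-small";

interface ModelTier {
  name: ModelName;
  sizeMb: number;      // download size listed above
  needsGpu: boolean;   // true when the tier is listed as GPU-only
  minMemoryGb: number; // assumed threshold, not taken from the page
}

// Ordered from most to least accurate so the first match is the best fit.
const TIERS: ModelTier[] = [
  { name: "whisper-small", sizeMb: 480, needsGpu: true, minMemoryGb: 8 },
  { name: "whisper-base", sizeMb: 145, needsGpu: false, minMemoryGb: 4 },
  { name: "whisper-tiny", sizeMb: 75, needsGpu: false, minMemoryGb: 0 },
];

function pickModel(hasWebGpu: boolean, memoryGb: number | null): ModelName {
  // Browsers that hide their RAM are treated as 4 GB devices (assumption).
  const ram = memoryGb ?? 4;
  for (const tier of TIERS) {
    if (tier.needsGpu && !hasWebGpu) continue; // GPU-only tier on a CPU-only device
    if (ram < tier.minMemoryGb) continue;      // not enough memory for this tier
    return tier.name;
  }
  return "whisper-tiny"; // smallest tier as a last resort
}
```

With these thresholds, a machine with WebGPU and 8 GB of RAM gets whisper-small, a CPU-only machine with 8 GB gets whisper-base, and a 2 GB device gets whisper-tiny.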
Languages Whisper can transcribe
Whisper recognises close to 100 languages. The most widely used include English, Spanish, French, German, Italian, Portuguese, Japanese, Chinese, and Arabic.
Frequently asked questions
The stuff everyone wonders before they try it.
Is my audio uploaded to a server?
No. The Whisper model downloads once to your browser and processes everything locally. Your files never leave your device at any point.
Is it actually free? What's the catch?
It's free because your own device does the computing, not our servers. We have zero compute costs to pass on to you. No minute limits, no file limits.
Does it work with any YouTube video?
It works with videos that have captions available, which is most of them. If a video has no captions, download the audio and drag it here. Whisper will transcribe it.
How long does transcription take?
It depends on your hardware. With a compatible GPU (WebGPU in Chrome or Edge), a 5-minute audio file takes around 15–30 seconds. Without a GPU, expect 1–3 minutes. The first run takes longer because it downloads the model.
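If you want a ballpark for a longer recording, the figures above can be scaled up. The sketch below assumes processing time grows linearly with audio length, which is a simplification, and it ignores the one-time model download. `REFERENCE_RANGE` and `estimateSeconds` are illustrative names, not part of the tool.

```typescript
type Backend = "webgpu" | "cpu";

// Length of the reference clip quoted above: 5 minutes.
const REFERENCE_AUDIO_SECONDS = 5 * 60;

// Processing time in seconds for that clip: 15-30 s with WebGPU, 1-3 min on CPU.
const REFERENCE_RANGE: Record<Backend, [number, number]> = {
  webgpu: [15, 30],
  cpu: [60, 180],
};

// Returns a [low, high] estimate in seconds, assuming linear scaling.
function estimateSeconds(audioSeconds: number, backend: Backend): [number, number] {
  const [low, high] = REFERENCE_RANGE[backend];
  const scale = audioSeconds / REFERENCE_AUDIO_SECONDS;
  return [Math.round(low * scale), Math.round(high * scale)];
}

// A one-hour recording: about 3-6 minutes with WebGPU, 12-36 minutes on CPU.
const oneHourGpu = estimateSeconds(3600, "webgpu"); // [180, 360]
const oneHourCpu = estimateSeconds(3600, "cpu");    // [720, 2160]
```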
What audio formats are supported?
mp3, wav, m4a, ogg, and webm. Video formats like mp4 also work in most modern browsers.
What languages can it transcribe?
Whisper is multilingual: English, Spanish, French, German, Italian, Portuguese, Japanese, Chinese, Arabic, and many more. You can force a language or let it auto-detect.
Which browser do I need?
Any modern browser works. For top speed with WebGPU you need Chrome 113+ or Edge 113+. Firefox and Safari run it on CPU, a bit slower but just as accurate.
Why is the first run slower?
The first time, it downloads the Whisper model (between 75 MB and 480 MB depending on the tier). It gets cached in your browser after that, so subsequent runs start instantly.
How accurate is the transcription?
It depends on the model. whisper-small (480 MB) delivers very high accuracy for major languages. whisper-tiny is faster but makes more mistakes with accents or background noise. For meetings with decent audio quality, all three models produce very usable results.
Does it work on mobile?
Yes, but it's slower. Many mobile browsers don't support WebGPU yet, so Whisper usually runs on the CPU. A 5-minute audio file can take 3–5 minutes on a phone. On a laptop or desktop the experience is much better.
Is there an audio length limit?
There's no imposed limit. The only constraint is your device's RAM. Audio files up to 2–3 hours work without issues on devices with 8 GB of RAM or more.
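Why RAM is the limiting factor is easy to see with some arithmetic. Whisper works on 16 kHz mono audio, and browsers typically hold decoded audio as 32-bit floats. Assuming the whole recording is decoded into memory at once (an assumption about how the page handles files, not something confirmed here), the footprint of the raw samples can be estimated like this. The function names are illustrative.

```typescript
const SAMPLE_RATE_HZ = 16000; // sample rate Whisper expects
const BYTES_PER_SAMPLE = 4;   // one 32-bit float per sample, mono

// Bytes needed to hold the decoded audio for a recording of the given length.
function decodedAudioBytes(durationSeconds: number): number {
  return durationSeconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE;
}

function toMegabytes(bytes: number): number {
  return bytes / 1_000_000;
}

// A 3-hour recording needs roughly 691 MB for the samples alone.
const threeHoursMb = toMegabytes(decodedAudioBytes(3 * 3600)); // 691.2
```

That figure sits on top of the model itself and the browser's own memory use, which is why a device with 8 GB of RAM or more is a comfortable match for multi-hour files.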
Is my data safe? Is it GDPR compliant?
Your audio never leaves your device, so there's no personal data for us to protect on our end. We don't use tracking cookies or collect personal information. It's about as GDPR-friendly as a tool can get.
Transcribe now
No account, nothing to install, your audio stays on your device.