X Bookmarks — 2023 KW46: Distil-Whisper drops Whisper latency by 6x

November 16, 2023


by Florian Narr


@LiorOnAI — distil-whisper cuts Whisper latency by 6x

A team just made OpenAI Whisper 6x faster and 49% smaller while keeping 99% of the accuracy.

The model is already available through the Hugging Face Transformers library:

model_id = "distil-whisper/distil-large-v2"
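A minimal sketch of how that model ID could be used with the Transformers `pipeline` API; the function name and audio path are illustrative, and running it requires `transformers` and `torch` to be installed (plus a model download on first use):

```python
MODEL_ID = "distil-whisper/distil-large-v2"

def transcribe(audio_path: str) -> str:
    """Transcribe an audio file with Distil-Whisper via the Transformers
    automatic-speech-recognition pipeline (a sketch, not the official recipe)."""
    # Deferred import so this module loads even without transformers installed.
    from transformers import pipeline  # pip install transformers torch

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)  # accepts a local file path or a URL
    return result["text"]
```

Calling `transcribe("meeting.mp3")` would then return the transcript as a plain string.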

You can also use their web UI to transcribe from URLs, files, or microphone.

That's a meaningful compression result: a 49% size reduction with roughly 1% accuracy loss is the kind of tradeoff that makes a model actually deployable in latency-sensitive contexts where the full large-v2 would be too slow. A 6x speedup on the same hardware changes what you can build. Saving this to run against our transcription pipeline.