Faster-Whisper Models Comparison

Model Variants Overview

Model     Parameters  Multilingual  English-Only  Distilled
tiny      39M         yes           .en           -
base      74M         yes           .en           -
small     244M        yes           .en           distil-*.en
medium    769M        yes           .en           distil-*.en
large-v3  1550M       yes           -             distil-*

Quick Selection Guide

Base (balanced)

  • Good balance of speed and accuracy
  • 74M parameters
  • Use for: voice dictation, development

Small (better quality)

  • Better accuracy
  • 244M parameters
  • Use for: production, professional apps

Medium (high quality)

  • High accuracy
  • 769M parameters
  • Use for: meetings, subtitling

Large-v3 (maximum quality)

  • Best accuracy
  • 1550M parameters
  • Use for: critical apps (legal, medical)

Distilled (distil-*)

  • 30-50% faster
  • Similar accuracy (~1-3% WER difference)
  • Available: small.en, medium.en, large-v2, large-v3

English-only (.en)

  • English transcription only
  • 20-30% faster than multilingual
  • Better English accuracy

Performance Comparison

Model     CPU Time*  Memory (RAM)  Speed    Accuracy
tiny      1x         ~1 GB         ⚡⚡⚡⚡⚡    ⭐⭐
base      2x         ~1 GB         ⚡⚡⚡⚡     ⭐⭐⭐
small     5x         ~2 GB         ⚡⚡⚡      ⭐⭐⭐⭐
medium    12x        ~5 GB         ⚡⚡       ⭐⭐⭐⭐⭐
large-v3  30x        ~10 GB        ⚡        ⭐⭐⭐⭐⭐

*CPU time relative to the tiny model, for 1 minute of audio
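The figures in this table can drive a simple selection rule. The sketch below is one possible approach, not part of Speaches: `ModelProfile`, `PROFILES`, and `largestModelWithin` are hypothetical names, and the numbers are the approximate values from the table rather than measurements.

```typescript
// Approximate figures copied from the comparison table; not benchmarks.
interface ModelProfile {
  name: string;
  relativeCpuTime: number; // multiple of the tiny model's CPU time
  approxRamGb: number;     // approximate RAM footprint in GB
}

// Ordered from smallest to largest model.
const PROFILES: ModelProfile[] = [
  { name: 'tiny', relativeCpuTime: 1, approxRamGb: 1 },
  { name: 'base', relativeCpuTime: 2, approxRamGb: 1 },
  { name: 'small', relativeCpuTime: 5, approxRamGb: 2 },
  { name: 'medium', relativeCpuTime: 12, approxRamGb: 5 },
  { name: 'large-v3', relativeCpuTime: 30, approxRamGb: 10 },
];

// Hypothetical helper: return the largest model that stays within both
// budgets, or undefined if even the smallest model does not fit.
function largestModelWithin(
  maxRamGb: number,
  maxRelativeCpuTime: number,
): string | undefined {
  let best: string | undefined;
  for (const p of PROFILES) {
    if (p.approxRamGb <= maxRamGb && p.relativeCpuTime <= maxRelativeCpuTime) {
      best = p.name; // later entries are larger, so keep overwriting
    }
  }
  return best;
}
```

For example, `largestModelWithin(4, 100)` picks `small`, because `medium` needs roughly 5 GB of RAM.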


Speaches Integration

Dynamic model loading (specify in API request):

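// In transcription.ts
import OpenAI from 'openai';

// Client pointed at the Speaches server (URL and key as in the Configuration section)
const openai = new OpenAI({ baseURL: 'http://localhost:8000/v1', apiKey: 'none' });

// audioFile: a File or readable stream prepared by the caller
const transcription = await openai.audio.transcriptions.create({
  file: audioFile,
  model: 'base',  // or 'small', 'medium', 'large-v3', etc.
  language: 'en',
});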

Available models:

  • tiny, tiny.en
  • base, base.en
  • small, small.en
  • medium, medium.en
  • large-v1, large-v2, large-v3
  • distil-small.en, distil-medium.en
  • distil-large-v2, distil-large-v3

Models are auto-downloaded on first use.
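Because the model is chosen per request, a typo in the name only surfaces when the request is made. A small client-side check can catch it earlier. This is a sketch under assumptions: `AVAILABLE_MODELS` and `isAvailableModel` are hypothetical names, and the list simply mirrors the short names listed above. It does not cover repository-style identifiers such as the `Systran/faster-whisper-base` value used in the Configuration section.

```typescript
// Short model names as listed in this document.
const AVAILABLE_MODELS = [
  'tiny', 'tiny.en',
  'base', 'base.en',
  'small', 'small.en',
  'medium', 'medium.en',
  'large-v1', 'large-v2', 'large-v3',
  'distil-small.en', 'distil-medium.en',
  'distil-large-v2', 'distil-large-v3',
] as const;

type ModelName = (typeof AVAILABLE_MODELS)[number];

// Hypothetical type guard: true only for names in the list above.
function isAvailableModel(name: string): name is ModelName {
  return AVAILABLE_MODELS.some((m) => m === name);
}
```

A caller might reject the request or fall back to `base` when `isAvailableModel(name)` returns false.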


Recommendations by Use Case

Use Case         Model               Why
Voice dictation  base or small       Fast + good accuracy
Meetings         medium              High accuracy + multi-speaker
Real-time        tiny.en or base.en  Low latency
Critical         large-v3            Maximum accuracy
Development      base                Fast iteration
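These recommendations could be encoded as a lookup so that application code asks for a use case rather than a model name. The following is a minimal sketch; `UseCase`, `RECOMMENDATIONS`, and `recommendModel` are hypothetical names introduced for illustration, and the mapping only restates the table.

```typescript
type UseCase = 'dictation' | 'meetings' | 'realtime' | 'critical' | 'development';

interface Recommendation {
  faster: string;       // first model named in the table
  moreAccurate: string; // second model named, or the same one if only one is listed
}

// Mapping taken directly from the recommendations table.
const RECOMMENDATIONS: Record<UseCase, Recommendation> = {
  dictation: { faster: 'base', moreAccurate: 'small' },
  meetings: { faster: 'medium', moreAccurate: 'medium' },
  realtime: { faster: 'tiny.en', moreAccurate: 'base.en' },
  critical: { faster: 'large-v3', moreAccurate: 'large-v3' },
  development: { faster: 'base', moreAccurate: 'base' },
};

// Hypothetical helper: pick a model for a use case, optionally
// preferring the more accurate of the two listed options.
function recommendModel(useCase: UseCase, preferAccuracy = false): string {
  const rec = RECOMMENDATIONS[useCase];
  return preferAccuracy ? rec.moreAccurate : rec.faster;
}
```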

Language Support

Multilingual (99+ languages):

  • Western European: en, fr, de, es, it, pt
  • Eastern European: ru, pl, uk, cs
  • Asian: zh, ja, ko, hi, ar

English-only (.en suffix):

  • Cannot transcribe other languages
  • 20-30% faster for English
  • Smaller model size
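One way to apply this in code is to switch to the `.en` variant only when the audio is known to be English and such a variant exists. The sketch below assumes, per the model list in this document, that `.en` variants exist for tiny, base, small, and medium but not for the large models; `modelForLanguage` is a hypothetical helper.

```typescript
// Sizes that have an English-only (.en) variant, per this document's model list.
const SIZES_WITH_ENGLISH_ONLY = ['tiny', 'base', 'small', 'medium'];

// Hypothetical helper: return the .en variant for English audio when one
// exists, otherwise the multilingual model of the same size.
function modelForLanguage(size: string, language: string): string {
  const hasEnglishOnly = SIZES_WITH_ENGLISH_ONLY.some((s) => s === size);
  return language === 'en' && hasEnglishOnly ? `${size}.en` : size;
}
```

With this rule, `modelForLanguage('small', 'en')` gives `small.en`, while French audio or a `large-v3` request keeps the multilingual model.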


Configuration

{
  "language": "en",
  "formatterEnabled": true,
  "transcription": {
    "backend": "speaches",
    "speaches": {
      "url": "http://localhost:8000/v1",
      "apiKey": "none",
      "model": "Systran/faster-whisper-base"
    }
  }
}
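A typed loader can catch a malformed configuration before the first transcription request. This is a sketch under assumptions: the field names come from the example above, but `AppConfig`, `SpeachesConfig`, and `parseConfig` are hypothetical, and the actual application may accept other backends or additional fields.

```typescript
interface SpeachesConfig {
  url: string;
  apiKey: string;
  model: string;
}

interface AppConfig {
  language: string;
  formatterEnabled: boolean;
  transcription: {
    backend: string;
    speaches: SpeachesConfig;
  };
}

// Hypothetical loader: parse JSON text and check the fields shown in the example.
function parseConfig(json: string): AppConfig {
  const raw = JSON.parse(json) as Partial<AppConfig>;
  if (typeof raw.language !== 'string') {
    throw new Error('language must be a string');
  }
  if (typeof raw.formatterEnabled !== 'boolean') {
    throw new Error('formatterEnabled must be a boolean');
  }
  const speaches = raw.transcription?.speaches;
  if (raw.transcription?.backend !== 'speaches' || !speaches) {
    // Assumption: this sketch only validates the speaches backend.
    throw new Error('transcription.backend must be "speaches" with a speaches block');
  }
  for (const key of ['url', 'apiKey', 'model'] as const) {
    if (typeof speaches[key] !== 'string' || speaches[key] === '') {
      throw new Error(`transcription.speaches.${key} must be a non-empty string`);
    }
  }
  return raw as AppConfig;
}
```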

Version: v2.0
Last Updated: 2025-10-11