Faster-Whisper Models Comparison¶
Model Variants Overview¶
Model | Parameters | Multilingual | English-Only | Distilled |
---|---|---|---|---|
tiny | 39M | ✅ | ✅ .en | ❌ |
base | 74M | ✅ | ✅ .en | ❌ |
small | 244M | ✅ | ✅ .en | ✅ distil-*.en |
medium | 769M | ✅ | ✅ .en | ✅ distil-*.en |
large-v3 | 1550M | ✅ | ❌ | ✅ distil-* |
Quick Selection Guide¶
Base (recommended default) ⭐¶
- Good balance of speed and accuracy
- 74M parameters
- Use for: voice dictation, development
Small (better quality)¶
- Better accuracy
- 244M parameters
- Use for: production, professional apps
Medium (high quality)¶
- High accuracy
- 769M parameters
- Use for: meetings, subtitling
Large-v3 (maximum quality)¶
- Best accuracy
- 1550M parameters
- Use for: critical apps (legal, medical)
Distilled (distil-*)¶
- 30-50% faster than the corresponding non-distilled model
- Similar accuracy (~1-3% WER difference)
- Available as: distil-small.en, distil-medium.en, distil-large-v2, distil-large-v3
English-only (.en)¶
- English transcription only
- 20-30% faster than multilingual
- Better English accuracy
Performance Comparison¶
Model | CPU Time* | Memory (RAM) | Speed | Accuracy |
---|---|---|---|---|
tiny | 1x | ~1 GB | ⚡⚡⚡⚡⚡ | ⭐⭐ |
base | 2x | ~1 GB | ⚡⚡⚡⚡ | ⭐⭐⭐ |
small | 5x | ~2 GB | ⚡⚡⚡ | ⭐⭐⭐⭐ |
medium | 12x | ~5 GB | ⚡⚡ | ⭐⭐⭐⭐⭐ |
large-v3 | 30x | ~10 GB | ⚡ | ⭐⭐⭐⭐⭐⭐ |
*CPU time relative to the tiny model, measured on 1 minute of audio
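As a rough illustration of how the table could be used programmatically, the sketch below encodes its figures and picks the most accurate model that fits a memory and CPU-time budget. `ModelProfile`, `PROFILES`, and `pickModel` are hypothetical names, not part of faster-whisper or Speaches, and the numbers are the approximate values from the table.

```typescript
// Approximate figures copied from the Performance Comparison table.
interface ModelProfile {
  name: string;
  relativeCpuTime: number; // multiples of the tiny model's CPU time
  memoryGb: number;        // approximate RAM footprint
}

// Ordered from least to most accurate, as in the table.
const PROFILES: ModelProfile[] = [
  { name: 'tiny', relativeCpuTime: 1, memoryGb: 1 },
  { name: 'base', relativeCpuTime: 2, memoryGb: 1 },
  { name: 'small', relativeCpuTime: 5, memoryGb: 2 },
  { name: 'medium', relativeCpuTime: 12, memoryGb: 5 },
  { name: 'large-v3', relativeCpuTime: 30, memoryGb: 10 },
];

// Hypothetical helper: returns the most accurate model within both budgets,
// or undefined when nothing fits.
function pickModel(
  maxMemoryGb: number,
  maxRelativeCpuTime: number = Infinity,
): ModelProfile | undefined {
  const candidates = PROFILES.filter(
    (p) => p.memoryGb <= maxMemoryGb && p.relativeCpuTime <= maxRelativeCpuTime,
  );
  // The last candidate is the most accurate because PROFILES is ordered.
  return candidates[candidates.length - 1];
}
```

For example, `pickModel(4)` selects `small`, since `medium` needs about 5 GB.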
Speaches Integration¶
Models are loaded dynamically; specify the model in each API request:
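// In transcription.ts
import fs from 'node:fs';
import OpenAI from 'openai';

// Client pointed at the Speaches server (URL and key as in the Configuration section)
const openai = new OpenAI({ baseURL: 'http://localhost:8000/v1', apiKey: 'none' });
const audioFile = fs.createReadStream('audio.wav'); // placeholder path

const transcription = await openai.audio.transcriptions.create({
  file: audioFile,
  model: 'base', // or 'small', 'medium', 'large-v3', etc.
  language: 'en',
});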
Available models:
- tiny, tiny.en
- base, base.en
- small, small.en
- medium, medium.en
- large-v1, large-v2, large-v3
- distil-small.en, distil-medium.en
- distil-large-v2, distil-large-v3
Models are auto-downloaded on first use.
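Because an unknown model name is only discovered when the request is made, it may help to validate names up front. The following is a minimal sketch, assuming the list above is complete; `AVAILABLE_MODELS`, `isAvailableModel`, and `describeModel` are hypothetical helpers, not part of the Speaches API.

```typescript
// Model names taken from the "Available models" list above.
const AVAILABLE_MODELS = [
  'tiny', 'tiny.en', 'base', 'base.en',
  'small', 'small.en', 'medium', 'medium.en',
  'large-v1', 'large-v2', 'large-v3',
  'distil-small.en', 'distil-medium.en',
  'distil-large-v2', 'distil-large-v3',
] as const;

type ModelName = (typeof AVAILABLE_MODELS)[number];

// Type guard: narrows an arbitrary string to a known model name.
function isAvailableModel(name: string): name is ModelName {
  return (AVAILABLE_MODELS as readonly string[]).indexOf(name) !== -1;
}

// Derives the variant flags from the naming convention:
// a ".en" suffix marks English-only, a "distil-" prefix marks distilled.
function describeModel(name: ModelName): { englishOnly: boolean; distilled: boolean } {
  return {
    englishOnly: /\.en$/.test(name),
    distilled: /^distil-/.test(name),
  };
}
```

A caller could check `isAvailableModel(userInput)` before building the request and fall back to a default such as `base` otherwise.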
Recommendations by Use Case¶
Use Case | Model | Why |
---|---|---|
Voice dictation | base or small | Fast + good accuracy |
Meetings | medium | High accuracy + multi-speaker |
Real-time | tiny.en or base.en | Low latency |
Critical | large-v3 | Maximum accuracy |
Development | base | Fast iteration |
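The table can also be expressed as a lookup, which keeps the choice of model in one place. This is only a sketch: `UseCase`, `RECOMMENDED`, and `recommendModel` are hypothetical names, and where the table lists two models the first is treated as the faster option and the second as the higher-quality one.

```typescript
type UseCase = 'voice-dictation' | 'meetings' | 'real-time' | 'critical' | 'development';

interface Recommendation {
  fast: string;    // first model listed in the table
  quality: string; // second model listed, or the same one if only one is given
}

// Values mirror the "Recommendations by Use Case" table.
const RECOMMENDED: Record<UseCase, Recommendation> = {
  'voice-dictation': { fast: 'base', quality: 'small' },
  'meetings': { fast: 'medium', quality: 'medium' },
  'real-time': { fast: 'tiny.en', quality: 'base.en' },
  'critical': { fast: 'large-v3', quality: 'large-v3' },
  'development': { fast: 'base', quality: 'base' },
};

// Returns the recommended model, preferring speed unless told otherwise.
function recommendModel(useCase: UseCase, preferQuality: boolean = false): string {
  const rec = RECOMMENDED[useCase];
  return preferQuality ? rec.quality : rec.fast;
}
```

For instance, `recommendModel('voice-dictation', true)` yields `small`, while `recommendModel('real-time')` yields `tiny.en`.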
Language Support¶
Multilingual (99+ languages):
- Western European: en, fr, de, es, it, pt
- Eastern European: ru, pl, uk, cs
- Asian: zh, ja, ko, hi, ar
English-only (.en suffix):
- Cannot transcribe other languages
- 20-30% faster for English
- Same parameter count as the multilingual model of the same size
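One way to apply this rule is to switch to the `.en` variant only when the audio is English and such a variant exists. The sketch below assumes the availability shown in the Model Variants Overview table (no `.en` variant for large-v3); `Size`, `HAS_ENGLISH_ONLY`, and `modelForLanguage` are hypothetical names.

```typescript
type Size = 'tiny' | 'base' | 'small' | 'medium' | 'large-v3';

// Which sizes ship an English-only (.en) variant, per the overview table.
const HAS_ENGLISH_ONLY: Record<Size, boolean> = {
  'tiny': true,
  'base': true,
  'small': true,
  'medium': true,
  'large-v3': false,
};

// Picks the .en variant for English audio when one exists;
// otherwise falls back to the multilingual model of that size.
function modelForLanguage(size: Size, language: string): string {
  const isEnglish = language.toLowerCase() === 'en';
  return isEnglish && HAS_ENGLISH_ONLY[size] ? `${size}.en` : size;
}
```

With this helper, `modelForLanguage('base', 'fr')` stays on the multilingual `base`, so a non-English request is never routed to a model that cannot transcribe it.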
Configuration¶
{
  "language": "en",
  "formatterEnabled": true,
  "transcription": {
    "backend": "speaches",
    "speaches": {
      "url": "http://localhost:8000/v1",
      "apiKey": "none",
      "model": "Systran/faster-whisper-base"
    }
  }
}
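If the application reads this file from TypeScript, a small typed loader can catch missing fields early. The following is a sketch under assumptions: the interfaces mirror only the keys shown above, and `AppConfig`, `SpeachesConfig`, and `parseConfig` are hypothetical names rather than an existing API.

```typescript
// Shapes inferred from the example configuration above (assumption).
interface SpeachesConfig {
  url: string;
  apiKey: string;
  model: string;
}

interface AppConfig {
  language: string;
  formatterEnabled: boolean;
  transcription: { backend: string; speaches: SpeachesConfig };
}

// Parses the JSON text and checks that the Speaches block is complete.
function parseConfig(json: string): AppConfig {
  const raw = JSON.parse(json);
  const speaches = raw?.transcription?.speaches;
  if (typeof speaches !== 'object' || speaches === null) {
    throw new Error('Missing "transcription.speaches" block');
  }
  const required: Array<keyof SpeachesConfig> = ['url', 'apiKey', 'model'];
  for (const key of required) {
    if (typeof speaches[key] !== 'string' || speaches[key] === '') {
      throw new Error(`"transcription.speaches.${key}" must be a non-empty string`);
    }
  }
  return raw as AppConfig;
}
```

The returned `transcription.speaches.url` and `apiKey` values could then be passed to the client constructor, and `model` to each transcription request.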
Version: v2.0 Last Updated: 2025-10-11