Faster-Whisper Models Comparison
Model Variants Overview
| Model | Parameters | Multilingual | English-Only | Distilled |
|---|---|---|---|---|
| tiny | 39M | ✅ | ✅ .en | ❌ |
| base | 74M | ✅ | ✅ .en | ❌ |
| small | 244M | ✅ | ✅ .en | ✅ distil-*.en |
| medium | 769M | ✅ | ✅ .en | ✅ distil-*.en |
| large-v3 | 1550M | ✅ | ❌ | ✅ distil-* |
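The availability matrix above can be encoded as a small lookup so that invalid combinations are caught before a request is made. The following is only a sketch: `AVAILABILITY` and `modelName` are hypothetical names introduced for illustration, and the data is copied from the table rather than queried from any library.

```typescript
// Sizes from the table above.
type Size = 'tiny' | 'base' | 'small' | 'medium' | 'large-v3';

interface Availability {
  englishOnly: boolean;     // a ".en" checkpoint exists for this size
  distilled: string | null; // name of the distilled checkpoint, if the table lists one
}

// Values transcribed from the table; adjust if the table changes.
const AVAILABILITY: Record<Size, Availability> = {
  tiny: { englishOnly: true, distilled: null },
  base: { englishOnly: true, distilled: null },
  small: { englishOnly: true, distilled: 'distil-small.en' },
  medium: { englishOnly: true, distilled: 'distil-medium.en' },
  'large-v3': { englishOnly: false, distilled: 'distil-large-v3' },
};

// Hypothetical helper: returns a model name, or throws if the table has no such variant.
function modelName(
  size: Size,
  opts: { englishOnly?: boolean; distilled?: boolean } = {},
): string {
  const info = AVAILABILITY[size];
  if (opts.englishOnly && !info.englishOnly) {
    throw new Error(`${size} has no English-only variant`);
  }
  if (opts.distilled) {
    if (info.distilled === null) {
      throw new Error(`${size} has no distilled variant`);
    }
    // Distilled small and medium are listed only as ".en" checkpoints.
    return info.distilled;
  }
  return opts.englishOnly ? `${size}.en` : size;
}
```

For example, `modelName('small', { englishOnly: true })` yields `small.en`, while `modelName('tiny', { distilled: true })` throws because the table lists no distilled tiny model.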
Quick Selection Guide
Base (recommended default) ⭐
- Good balance of speed and accuracy
- 74M parameters
- Use for: voice dictation, development
Small (better quality)
- More accurate than base
- 244M parameters
- Use for: production, professional apps
Medium (high quality)
- High accuracy
- 769M parameters
- Use for: meetings, subtitling
Large-v3 (maximum quality)
- Best accuracy
- 1550M parameters
- Use for: critical apps (legal, medical)
Distilled (distil-*)
- 30-50% faster than the corresponding full model
- Similar accuracy (~1-3% WER difference)
- Available: distil-small.en, distil-medium.en, distil-large-v2, distil-large-v3
English-only (.en)
- English transcription only
- 20-30% faster than the multilingual model of the same size
- Better English accuracy, most noticeably for the smaller models (tiny, base)
Performance Comparison
| Model | CPU Time* | Memory (RAM) | Speed | Accuracy |
|---|---|---|---|---|
| tiny | 1x | ~1 GB | ⚡⚡⚡⚡⚡ | ⭐⭐ |
| base | 2x | ~1 GB | ⚡⚡⚡⚡ | ⭐⭐⭐ |
| small | 5x | ~2 GB | ⚡⚡⚡ | ⭐⭐⭐⭐ |
| medium | 12x | ~5 GB | ⚡⚡ | ⭐⭐⭐⭐⭐ |
| large-v3 | 30x | ~10 GB | ⚡ | ⭐⭐⭐⭐⭐⭐ |
*Relative to tiny model on 1 minute of audio
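One way to use these figures is to pick the largest model that fits a memory and time budget. The sketch below assumes the table's approximate values; `PROFILES` and `largestModelWithin` are hypothetical names, and real memory use will depend on the machine and compute type.

```typescript
interface ModelProfile {
  name: string;
  relativeCpuTime: number; // multiples of the tiny model's processing time
  ramGb: number;           // approximate memory footprint from the table
}

// Ordered from smallest to largest; values copied from the table above.
const PROFILES: ModelProfile[] = [
  { name: 'tiny', relativeCpuTime: 1, ramGb: 1 },
  { name: 'base', relativeCpuTime: 2, ramGb: 1 },
  { name: 'small', relativeCpuTime: 5, ramGb: 2 },
  { name: 'medium', relativeCpuTime: 12, ramGb: 5 },
  { name: 'large-v3', relativeCpuTime: 30, ramGb: 10 },
];

// Returns the largest profile within both budgets, or undefined if none fits.
function largestModelWithin(
  ramBudgetGb: number,
  maxRelativeCpuTime: number = Infinity,
): ModelProfile | undefined {
  const fits = PROFILES.filter(
    (p) => p.ramGb <= ramBudgetGb && p.relativeCpuTime <= maxRelativeCpuTime,
  );
  return fits[fits.length - 1];
}
```

With a 4 GB budget this selects small. Capping the relative CPU time at 10 also selects small, even when memory would allow a larger model.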
Speaches Integration
Models are loaded dynamically; specify the model in each API request:
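```typescript
// In transcription.ts
import OpenAI from 'openai';

// Client pointed at the Speaches server (URL and key as in the Configuration section)
const openai = new OpenAI({ baseURL: 'http://localhost:8000/v1', apiKey: 'none' });

const transcription = await openai.audio.transcriptions.create({
  file: audioFile, // a File or read stream prepared elsewhere
  model: 'base', // or 'small', 'medium', 'large-v3', etc.
  language: 'en',
});
```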
Available models:
- tiny, tiny.en
- base, base.en
- small, small.en
- medium, medium.en
- large-v1, large-v2, large-v3
- distil-small.en, distil-medium.en
- distil-large-v2, distil-large-v3
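Models are downloaded automatically on first use. Note that the Configuration section below refers to the base model by its full ID (Systran/faster-whisper-base).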
Recommendations by Use Case
| Use Case | Model | Why |
|---|---|---|
| Voice dictation | base or small | Fast + good accuracy |
| Meetings | medium | High accuracy + multi-speaker |
| Real-time | tiny.en or base.en | Low latency |
| Critical | large-v3 | Maximum accuracy |
| Development | base | Fast iteration |
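The recommendations above could be expressed as a lookup keyed by use case. This is a sketch only: `RECOMMENDED` and `recommendModel` are hypothetical names, and the `preferAccuracy` flag is an assumption added to choose between the two options the table gives for some rows.

```typescript
type UseCase = 'voice-dictation' | 'meetings' | 'real-time' | 'critical' | 'development';

interface Recommendation {
  lighter: string; // first (smaller) option in the table row
  heavier: string; // second option, or the same model when the row lists only one
  why: string;
}

// Rows transcribed from the table above.
const RECOMMENDED: Record<UseCase, Recommendation> = {
  'voice-dictation': { lighter: 'base', heavier: 'small', why: 'Fast + good accuracy' },
  meetings: { lighter: 'medium', heavier: 'medium', why: 'High accuracy + multi-speaker' },
  'real-time': { lighter: 'tiny.en', heavier: 'base.en', why: 'Low latency' },
  critical: { lighter: 'large-v3', heavier: 'large-v3', why: 'Maximum accuracy' },
  development: { lighter: 'base', heavier: 'base', why: 'Fast iteration' },
};

// Returns the lighter option by default, the heavier one when accuracy matters more.
function recommendModel(useCase: UseCase, preferAccuracy: boolean = false): string {
  const row = RECOMMENDED[useCase];
  return preferAccuracy ? row.heavier : row.lighter;
}
```

For instance, `recommendModel('voice-dictation')` returns `base`, and `recommendModel('voice-dictation', true)` returns `small`.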
Language Support
Multilingual (99+ languages):
- Western European: en, fr, de, es, it, pt
- Eastern European: ru, pl, uk, cs
- Asian: zh, ja, ko, hi, ar

English-only (.en suffix):
- Cannot transcribe other languages
- 20-30% faster for English
- Same parameter count as the multilingual model of the same size
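Because .en models cannot transcribe other languages, it can help to check the model against the requested language before sending a request. The sketch below uses a hypothetical helper, `resolveModelForLanguage`, and assumes the multilingual counterpart of a .en model is the same name without the suffix, which matches the model list in this document.

```typescript
// Hypothetical helper: returns a model name that can handle the given language code.
function resolveModelForLanguage(model: string, language: string): string {
  const englishOnly = /\.en$/.test(model);
  if (!englishOnly || language === 'en') {
    return model; // already compatible
  }
  if (/^distil-/.test(model)) {
    // distil-small.en and distil-medium.en have no multilingual counterpart in the list.
    throw new Error(`${model} is English-only; choose a multilingual model for "${language}"`);
  }
  // 'small.en' -> 'small', 'base.en' -> 'base', and so on.
  return model.replace(/\.en$/, '');
}
```

Here `resolveModelForLanguage('small.en', 'fr')` falls back to `small`, while an English request keeps `small.en` unchanged.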
Configuration
```json
{
  "language": "en",
  "formatterEnabled": true,
  "transcription": {
    "backend": "speaches",
    "speaches": {
      "url": "http://localhost:8000/v1",
      "apiKey": "none",
      "model": "Systran/faster-whisper-base"
    }
  }
}
```
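A typed reader for this configuration might look like the following. It is a minimal sketch: the interface and function names (`AppConfig`, `SpeachesSettings`, `readSpeachesSettings`) are hypothetical, and the validation rules (non-empty strings, backend set to "speaches") are assumptions inferred from the example rather than a documented schema.

```typescript
interface SpeachesSettings {
  url: string;
  apiKey: string;
  model: string;
}

interface AppConfig {
  language: string;
  formatterEnabled: boolean;
  transcription: { backend: string; speaches: SpeachesSettings };
}

// Parses the JSON text and returns the Speaches block, throwing on missing fields.
function readSpeachesSettings(json: string): SpeachesSettings {
  const raw = JSON.parse(json) as Partial<AppConfig>;
  const transcription = raw.transcription;
  if (!transcription || transcription.backend !== 'speaches' || !transcription.speaches) {
    throw new Error('expected transcription.backend "speaches" with a "speaches" block');
  }
  const settings = transcription.speaches;
  const keys: Array<keyof SpeachesSettings> = ['url', 'apiKey', 'model'];
  for (const key of keys) {
    const value = settings[key];
    if (typeof value !== 'string' || value.length === 0) {
      throw new Error(`transcription.speaches.${key} must be a non-empty string`);
    }
  }
  return settings;
}
```

The returned `url` and `apiKey` can then be passed to the client, and `model` to each transcription request.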
Version: v2.0 Last Updated: 2025-10-11