Whisper Turbo Multilingual (CTranslate2 / Faster-Whisper)
This repository contains the CTranslate2 converted version of Dafisns/whisper-turbo-multilingual-fleurs.
It is optimized for fast, low-memory inference using the faster-whisper library. The model was fine-tuned on a mix of Indonesian and English datasets, including Google FLEURS, Common Voice 22.0, and EdAcc (for Indonesian-accented English).
Model Details
- Original Model: Dafisns/whisper-turbo-multilingual-fleurs
- Base Architecture: OpenAI Whisper Large V3 Turbo
- Format: CTranslate2 (INT8 / Float16 quantization)
- Optimization: Up to 4x faster inference than the standard Transformers implementation, with significantly reduced VRAM usage.
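CTranslate2 checkpoints like this one are typically produced with the converter CLI that ships with the ctranslate2 package. A sketch of the conversion step (the output directory name is illustrative; the exact command used for this repository is not documented here):

```shell
# Convert the original Transformers checkpoint to CTranslate2,
# quantizing weights to float16 (use --quantization int8 for CPU targets).
ct2-transformers-converter \
  --model Dafisns/whisper-turbo-multilingual-fleurs \
  --output_dir whisper-turbo-multilingual-fleurs-ct2 \
  --quantization float16
```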
Performance (WER)
The following Word Error Rates (WER) were achieved by the original model on the combined test sets:
| Language | Dataset Composition | WER (%) |
|---|---|---|
| English | FLEURS + Common Voice + EdAcc | 9.09 |
| Indonesian | FLEURS + Common Voice | 6.97 |
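WER is the word-level edit distance (substitutions + insertions + deletions) divided by the number of reference words. The evaluation pipeline used for the numbers above is not specified here, but the metric itself can be sketched without dependencies:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Rolling row of edit distances between prefixes of ref and hyp.
    d = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev, d[0] = d[0], i
        for j in range(1, len(hyp) + 1):
            cur = d[j]
            if ref[i - 1] == hyp[j - 1]:
                d[j] = prev  # no edit needed for this word
            else:
                # substitution, deletion, or insertion
                d[j] = 1 + min(prev, d[j], d[j - 1])
            prev = cur
    return d[-1] / len(ref)

# One substituted word out of three -> WER of 1/3.
print(wer("saya suka kopi", "saya minum kopi"))
```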
Installation
You need to install the faster-whisper library to use this model efficiently:

```bash
pip install faster-whisper
```
Usage

```python
from faster_whisper import WhisperModel

# Use device="cuda" for GPU or device="cpu" for CPU.
# compute_type="float16" is recommended for GPU, "int8" for CPU.
model_id = "Dafisns/whisper-turbo-multilingual-fleurs-ct2"
model = WhisperModel(model_id, device="cuda", compute_type="float16")

# Transcribe an audio file.
# Setting language="id" forces Indonesian; omit it to auto-detect.
segments, info = model.transcribe("audio.mp3", beam_size=1, language="id")

print(f"Detected language '{info.language}' with probability {info.language_probability}")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```
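Since the recommended compute type depends on whether a GPU is available, selecting it programmatically keeps one script portable across machines. A minimal helper sketch (the function name is illustrative, not part of the faster-whisper API):

```python
def pick_whisper_config(has_cuda: bool) -> tuple[str, str]:
    """Return (device, compute_type) following the guidance above:
    float16 on GPU, int8 quantization on CPU."""
    return ("cuda", "float16") if has_cuda else ("cpu", "int8")

# Hypothetical usage (requires faster-whisper and, e.g., torch for CUDA detection):
# import torch
# device, compute_type = pick_whisper_config(torch.cuda.is_available())
# model = WhisperModel(model_id, device=device, compute_type=compute_type)
```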
Model tree for Dafisns/whisper-turbo-multilingual-cf-ct2
- Base model: openai/whisper-large-v3
- Fine-tuned from: openai/whisper-large-v3-turbo