Migrating from openai-whisper to faster-whisper for Faster Transcription
I’ve been running a skill that automatically generates meeting minutes from video files. The audio transcription component was using openai-whisper (OpenAI’s official Whisper CLI), but after migrating to faster-whisper, processing speed and memory efficiency improved significantly.
This article covers the motivation for the migration, the specific changes made, and the results.
Motivation
openai-whisper is OpenAI’s official implementation, and it’s appealing for its simplicity—you can transcribe audio with a single whisper command. However, in practice, I had the following concerns:
- Slow processing: Transcription of long videos (over 1 hour) takes a considerable amount of time
- High memory usage: Being PyTorch-based, model loading consumes significant memory
faster-whisper is a reimplementation based on CTranslate2, claiming up to 4x speedup and lower memory consumption.
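The "up to 4x" figure is the upstream project's claim, so it is worth measuring on your own audio. A minimal stdlib timing harness like the following can wrap either backend; the `time.sleep` calls are stand-ins for real transcription calls:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results):
    # Record wall-clock time for the enclosed block under the given label.
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start

results = {}
with timed("baseline", results):
    time.sleep(0.01)  # stand-in for an openai-whisper run
with timed("faster", results):
    time.sleep(0.01)  # stand-in for a faster-whisper run

for label, elapsed in results.items():
    print(f"{label}: {elapsed:.2f}s")
```

Running each backend a few times on the same file and comparing the recorded durations gives a fairer picture than a single run, since model loading dominates short inputs.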
Changes Made
Dependency Change
```bash
# Before
pip install openai-whisper

# After
pip install faster-whisper
```

openai-whisper depends on PyTorch, which means large download sizes during installation. faster-whisper is CTranslate2-based with lighter dependencies.
From CLI to Python Script
With openai-whisper, I was calling the whisper CLI command directly. Since faster-whisper is provided as a Python library, I created a wrapper script:
```python
import argparse


def format_time(seconds):
    h = int(seconds // 3600)
    m = int((seconds % 3600) // 60)
    s = seconds % 60
    return f"{h:02d}:{m:02d}:{s:05.2f}"


def main():
    parser = argparse.ArgumentParser(
        description="Transcribe audio using faster-whisper."
    )
    parser.add_argument("audio_path", help="Path to the audio file to transcribe.")
    parser.add_argument("--language", default="ja", help="Language code (default: ja)")
    parser.add_argument("--model", default="large-v3", help="Whisper model name (default: large-v3)")
    parser.add_argument("--output", default="meeting_audio.txt", help="Output text file path")
    parser.add_argument("--device", default="cpu", help="Device to use (default: cpu)")
    parser.add_argument("--compute-type", default="int8", help="Compute type for quantization (default: int8)")
    parser.add_argument("--beam-size", type=int, default=5, help="Beam size (default: 5)")
    args = parser.parse_args()

    # Imported after argument parsing so --help stays fast.
    from faster_whisper import WhisperModel

    print(f"Loading model '{args.model}' on {args.device} ({args.compute_type})...", flush=True)
    model = WhisperModel(args.model, device=args.device, compute_type=args.compute_type)

    print(f"Transcribing '{args.audio_path}' (language={args.language})...", flush=True)
    segments, info = model.transcribe(
        args.audio_path, language=args.language, beam_size=args.beam_size
    )
    print(f"Detected language: {info.language} (probability {info.language_probability:.2f})", flush=True)

    lines = []
    for segment in segments:
        line = segment.text.strip()
        print(f"[{format_time(segment.start)} -> {format_time(segment.end)}] {line}", flush=True)
        lines.append(line)

    with open(args.output, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    print(f"\nTranscription saved to '{args.output}'", flush=True)


if __name__ == "__main__":
    main()
```

Key points:
- `int8` quantization: INT8 quantization is enabled by default, keeping memory usage low even on CPU
- `large-v3` model: Switched from openai-whisper's `turbo` model to `large-v3`. Thanks to faster-whisper's optimizations, the larger model still runs at practical speed
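As a quick sanity check, the `format_time` helper from the wrapper script can be exercised standalone; it converts a second count into an `HH:MM:SS.ss` timestamp for the segment log lines:

```python
def format_time(seconds):
    # Same helper as in the wrapper script: split seconds into h/m/s parts.
    h = int(seconds // 3600)
    m = int((seconds % 3600) // 60)
    s = seconds % 60
    return f"{h:02d}:{m:02d}:{s:05.2f}"

print(format_time(3725.5))  # 1 h, 2 min, 5.5 s -> "01:02:05.50"
```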
Usage Change
```bash
# Before
whisper meeting_audio.wav --language ja --model turbo > meeting_audio.log 2>&1

# After
python scripts/transcribe.py meeting_audio.wav --language ja --model large-v3 > meeting_audio.log 2>&1
```

The workflow of running the job in the background and monitoring it with `tail -f meeting_audio.log` remains the same. However, the completion detection method has changed:
```bash
# Before: complete when the log stops updating (ambiguous)
ps aux | grep whisper

# After: complete when "Transcription saved to" appears in the log (explicit)
ps aux | grep transcribe.py
```

Summary
| Item | openai-whisper | faster-whisper |
|---|---|---|
| Engine | PyTorch | CTranslate2 |
| Speed | Baseline | Up to 4x faster |
| Memory | High | Low (INT8 quantization) |
| Interface | CLI command | Python library |
| Model | turbo | large-v3 |
The migration itself was straightforward: including writing the wrapper script, it took very little time. If you're dissatisfied with openai-whisper's processing speed or memory consumption, migrating to faster-whisper is well worth considering.
That’s all for today—faster transcription is always better. That’s all from the Gemba.