LLM Polish
Use AI to add punctuation to your transcripts for better readability.
The Problem
Whisper.cpp sometimes produces transcripts without proper punctuation, especially for conversational audio or when speakers don't pause clearly between sentences. For example:
Raw whisper output (no punctuation):
hello how are you today I am fine thanks for asking what about you
Without punctuation, the transcript is harder to read and understand.
The Solution
The Polish stage uses a Large Language Model (LLM) to analyze the text and add punctuation semantically — based on meaning and context, not just pauses.
After LLM polish:
Hello, how are you today? I am fine, thanks for asking. What about you?
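Conceptually, the polish step just sends the raw transcript to an LLM with an instruction to punctuate without changing any words. A minimal sketch in Python, where `build_polish_prompt` and the prompt wording are illustrative (not the app's actual prompt), and `complete` stands in for whichever cloud or local LLM call is configured:

```python
def build_polish_prompt(transcript: str) -> str:
    """Build the instruction sent to the LLM.
    The wording here is illustrative, not the app's actual prompt."""
    return (
        "Add punctuation and sentence casing to the following transcript. "
        "Do not add, remove, or reorder any words.\n\n"
        + transcript
    )

def polish(transcript: str, complete) -> str:
    """`complete` is any callable that sends a prompt string to an LLM
    and returns the model's text response."""
    return complete(build_polish_prompt(transcript)).strip()

# Demo with a stand-in for the real LLM call:
def fake_llm(prompt: str) -> str:
    return "Hello, how are you today? I am fine, thanks for asking. What about you?"

raw = "hello how are you today I am fine thanks for asking what about you"
print(polish(raw, fake_llm))
```

Because the model works from meaning rather than audio pauses, it can place question marks and sentence boundaries even when the speaker never paused.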
Current Scope: The Polish stage currently only adds punctuation. It does not remove filler words (um, uh, like) or restructure sentences.
Configuration Options
Default settings work for most cases. Click the Advanced button to reveal additional options.
Default Options
Model
Default: google:gemini-2.5-flash-lite. The LLM to use, specified in provider:model format. Choose between cloud-based and local processing:
Cloud Models:
- google:gemini-2.5-flash-lite — Fast and cost-effective (default)
- openai:gpt-4o-mini — High-quality results
Cloud models require an internet connection and an API key. See API Keys for setup instructions.
Local Model:
- llamacpp:qwen3-8b — Runs entirely on your Mac
The local model (~5.4 GB) is downloaded on first use. See Local LLM for details.
Privacy: When using the local Qwen3-8B model, your transcript never leaves your Mac. All processing happens on-device.
Advanced Options
Custom Model String
In advanced mode, you can enter a custom model string instead of using the dropdown. This is useful for testing different model versions or configurations. Format: provider:model.
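A custom model string is just a provider name and a model name joined by the first colon. A small sketch of how such a string can be validated and split (the function name is hypothetical, not part of the app):

```python
def parse_model_string(s: str) -> tuple[str, str]:
    """Split a provider:model string on the first colon,
    e.g. 'google:gemini-2.5-flash-lite' -> ('google', 'gemini-2.5-flash-lite')."""
    provider, sep, model = s.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider:model, got {s!r}")
    return provider, model

print(parse_model_string("llamacpp:qwen3-8b"))  # ('llamacpp', 'qwen3-8b')
```

Splitting on only the first colon keeps any additional colons inside the model name intact.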
Overwrite
Disabled by default. When enabled, overwrites the original transcription file with the polished result.
Extra Args
Default: --align-with-word-segments --print-result. Additional arguments passed to split_sentence for controlling text chunking behavior before LLM processing.
Tip (thinking / reasoning models): If sentence splitting isn't good enough, try enabling extra “thinking” via --thinking-budget in Extra Args:
- Gemini (google:gemini-*): --thinking-budget 2048 (integer; default 0 = off)
- OpenAI reasoning (openai:gpt-5* or openai:o*): --thinking-budget medium (minimal|low|medium|high; default low)
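Putting this together, if you keep the defaults and want a Gemini thinking budget, the Extra Args field might read (values shown are an example, not a recommendation):

```shell
--align-with-word-segments --print-result --thinking-budget 2048
```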
Related
- Transcription — The prerequisite stage
- Local LLM — Run models locally on your Mac
- API Keys — Configure cloud LLM providers