LLM Polish

Use AI to add punctuation to your transcripts for better readability.

The Problem

Whisper.cpp sometimes produces transcripts without proper punctuation, especially for conversational audio or when speakers don't pause clearly between sentences. For example:

Raw whisper output (no punctuation):

hello how are you today I am fine thanks for asking what about you

Without punctuation, the transcript is harder to read and understand.

The Solution

The Polish stage uses a Large Language Model (LLM) to analyze the text and add punctuation semantically — based on meaning and context, not just pauses.

After LLM polish:

Hello, how are you today? I am fine, thanks for asking. What about you?

Current Scope: The Polish stage currently only adds punctuation. It does not remove filler words (um, uh, like) or restructure sentences.
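To illustrate the idea (this is a hypothetical sketch, not the app's actual prompt or code), a punctuation-only polish step boils down to sending the raw transcript to an LLM with strict instructions to change nothing but punctuation and capitalization:

```python
def build_polish_prompt(raw_transcript: str) -> str:
    """Build a punctuation-restoration prompt (illustrative only)."""
    return (
        "Add punctuation and capitalization to the transcript below.\n"
        "Do not add, remove, or reorder any words.\n\n"
        + raw_transcript
    )

# The resulting prompt would then be sent to the configured provider:model.
prompt = build_polish_prompt("hello how are you today I am fine thanks for asking")
print(prompt)
```

The "do not add, remove, or reorder" constraint is what keeps the stage punctuation-only, matching the current scope described above.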

Configuration Options

Default settings work for most cases. Click the Advanced button to reveal additional options.

Default Options

Model

Default: google:gemini-2.5-flash-lite. The model to use, specified in provider:model format. Choose between cloud-based and local processing:

Cloud Models:

  • google:gemini-2.5-flash-lite — Fast and cost-effective (default)
  • openai:gpt-4o-mini — High-quality results

Cloud models require an internet connection and an API key. See API Keys for setup instructions.

Local Model:

  • llamacpp:qwen3-8b — Runs entirely on your Mac

The local model (~5.4 GB) is downloaded on first use. See Local LLM for details.

Privacy: When using the local Qwen3-8B model, your transcript never leaves your Mac. All processing happens on-device.
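The provider:model format is simply the provider name and the model name joined by the first colon, which matters because some model names may themselves contain colons or dashes. A minimal sketch of how such a string could be parsed (hypothetical helper, not the app's code):

```python
def parse_model_string(spec: str) -> tuple[str, str]:
    """Split a provider:model spec on the FIRST colon only."""
    provider, _, model = spec.partition(":")
    if not provider or not model:
        raise ValueError(f"expected provider:model, got {spec!r}")
    return provider, model

print(parse_model_string("google:gemini-2.5-flash-lite"))
# → ('google', 'gemini-2.5-flash-lite')
```

Any string in this shape should be accepted by the Custom Model String field described under Advanced Options.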

Advanced Options

Custom Model String

In advanced mode, you can enter a custom model string instead of using the dropdown. This is useful for testing different model versions or configurations. Format: provider:model.

Overwrite

Disabled by default. When enabled, overwrites the original transcription file with the polished result.

Extra Args

Default: --align-with-word-segments --print-result. Additional arguments passed to split_sentence, which controls how text is chunked into sentences before LLM processing.

Tip (thinking / reasoning models): If sentence splitting isn't good enough, try enabling extra “thinking” via --thinking-budget in Extra Args:

  • Gemini (google:gemini-*): --thinking-budget 2048 (integer; default 0 = off)
  • OpenAI reasoning (openai:gpt-5* or openai:o*): --thinking-budget medium (minimal | low | medium | high; default low)
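Note that the same flag takes different value types per provider: an integer token budget for Gemini, a named effort level for OpenAI reasoning models. A hedged sketch of validating the value (hypothetical helper, assuming only these two providers support the flag):

```python
# Allowed effort levels for OpenAI reasoning models, per the list above.
OPENAI_LEVELS = {"minimal", "low", "medium", "high"}

def parse_thinking_budget(provider: str, value: str):
    """Validate a --thinking-budget value for the given provider."""
    if provider == "google":
        budget = int(value)       # token budget; 0 turns thinking off
        if budget < 0:
            raise ValueError("budget must be >= 0")
        return budget
    if provider == "openai":
        if value not in OPENAI_LEVELS:
            raise ValueError(f"expected one of {sorted(OPENAI_LEVELS)}, got {value!r}")
        return value
    raise ValueError(f"--thinking-budget is not supported for provider {provider!r}")

print(parse_thinking_budget("google", "2048"))   # → 2048
print(parse_thinking_budget("openai", "medium")) # → medium
```

Passing an OpenAI-style level to a Gemini model (or vice versa) would fail validation, so match the value type to the provider in your model string.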

Related