LLM Polish

Use AI to add punctuation to your transcripts for better readability.

The Problem

Whisper.cpp sometimes produces transcripts without proper punctuation, especially for conversational audio or when speakers don't pause clearly between sentences. For example:

Raw whisper.cpp output (no punctuation):

hello how are you today I am fine thanks for asking what about you

Without punctuation, the transcript is harder to read and understand.

The Solution

The Polish stage uses a Large Language Model (LLM) to analyze the text and add punctuation semantically — based on meaning and context, not just pauses.

After LLM polish:

Hello, how are you today? I am fine, thanks for asking. What about you?

Current Scope: The Polish stage currently only adds punctuation. It does not remove filler words (um, uh, like) or restructure sentences.
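To make the behavior concrete, here is a minimal sketch of such a punctuation pass, assuming an OpenAI-compatible chat endpoint (the kind llama.cpp's server exposes). The prompt wording, URL, and port are illustrative assumptions, not Shuole's actual implementation:

```python
import json
import urllib.request

def polish(text: str, base_url: str = "http://127.0.0.1:8080") -> str:
    """Ask an OpenAI-compatible chat endpoint to add punctuation only.

    The endpoint, port, and prompt are illustrative assumptions;
    Shuole's actual prompt and server configuration may differ.
    """
    payload = {
        "messages": [
            {
                "role": "system",
                "content": (
                    "Add punctuation and capitalization to the user's text. "
                    "Do not remove filler words or change any wording."
                ),
            },
            {"role": "user", "content": text},
        ],
        "temperature": 0,
    }
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

print(polish("hello how are you today I am fine thanks for asking what about you"))
# Expected output, roughly:
# Hello, how are you today? I am fine, thanks for asking. What about you?
```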

Configuration Options

Default settings work for most cases. Click the Advanced button to reveal additional options.

Default Options

Model

Default: google:gemini-2.5-flash-lite. The LLM to use, specified in provider:model format. Choose between cloud-based and local processing:

Cloud Models:

  • google:gemini-2.5-flash-lite — Fast and cost-effective (default)
  • openai:gpt-4o-mini — High quality results

Cloud models require an internet connection and send your transcript to the provider's API.

Local Model:

  • llamacpp:qwen3-8b — Runs entirely on your Mac

The local model (~5.4 GB) is downloaded on first use. Processing is slower than the cloud models but keeps your data completely private.

Privacy: When using the local Qwen3-8B model, your transcript never leaves your Mac. All processing happens on-device.

Advanced Options

Custom Model String

In advanced mode, you can enter a custom model string instead of using the dropdown. This is useful for testing different model versions or configurations. Format: provider:model.
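The string splits on the first colon: the part before it selects the provider, the part after it names the model. A minimal sketch of that parsing rule (an assumption about the format, not Shuole's code):

```python
def parse_model_string(spec: str) -> tuple[str, str]:
    """Split a provider:model string on the first colon.

    Illustrative only; splitting on the first colon means model
    names may themselves contain colons.
    """
    provider, _, model = spec.partition(":")
    if not provider or not model:
        raise ValueError(f"expected provider:model, got {spec!r}")
    return provider, model

print(parse_model_string("google:gemini-2.5-flash-lite"))  # ('google', 'gemini-2.5-flash-lite')
print(parse_model_string("llamacpp:qwen3-8b"))             # ('llamacpp', 'qwen3-8b')
```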

Overwrite

Disabled by default. When enabled, overwrites the original transcription file with the polished result.

Extra Args

Default: --align-with-word-segments --print-result. Additional arguments passed to split_sentence to control how the text is chunked before LLM processing.
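As an illustration, extra arguments like these are typically tokenized shell-style and appended to the tool's command line. The invocation details below are assumptions; only the flag values come from the default above:

```python
import shlex

extra_args = "--align-with-word-segments --print-result"

# Tokenize shell-style, then append to the base command.
# "split_sentence" is the tool named above; how Shuole actually
# invokes it is an assumption for illustration.
cmd = ["split_sentence", *shlex.split(extra_args)]
print(cmd)  # ['split_sentence', '--align-with-word-segments', '--print-result']
```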

LLM Server Status

When using the local model, Shuole runs a llama.cpp server in the background. You can monitor and control it via the LLM status chip in the app header:

  • Green dot — Server is running and ready
  • Gray dot — Server is stopped

Click the chip to start or stop the server manually. The server automatically shuts down when you quit Shuole.

[Screenshot: LLM server starting popup]
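To check the server from outside the app, llama.cpp's server exposes a /health endpoint that returns 200 once the model is loaded. A minimal probe, assuming llama.cpp's default port 8080 (Shuole may bind a different port):

```python
import urllib.error
import urllib.request

def server_is_ready(base_url: str = "http://127.0.0.1:8080") -> bool:
    """Probe llama.cpp's /health endpoint.

    Port 8080 is llama.cpp's default; Shuole may use a different one.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=2) as resp:
            return resp.status == 200  # 200 means the model is loaded and ready
    except (urllib.error.URLError, OSError):
        return False  # connection refused or still loading: not ready

print("running" if server_is_ready() else "stopped")
```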
