Local LLM
Run large language models entirely on your Mac — your data never leaves your device.
What It Does
Shuole includes a built-in llama.cpp server that runs large language models locally. This powers features like LLM Polish without sending your data to external services.
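Under the hood, a llama.cpp server exposes an OpenAI-compatible HTTP API. As a rough sketch, assuming the server listens on llama.cpp's default port 8080 (Shuole's actual port and model identifier are not documented here), a request could look like this in Swift:

```swift
import Foundation

// A minimal sketch of calling a local llama.cpp server.
// Assumptions: the server listens on 127.0.0.1:8080 (llama.cpp's default)
// and exposes the OpenAI-compatible /v1/chat/completions endpoint.
struct ChatMessage: Codable { let role: String; let content: String }
struct ChatRequest: Codable { let model: String; let messages: [ChatMessage] }

func polish(_ text: String) async throws -> Data {
    let url = URL(string: "http://127.0.0.1:8080/v1/chat/completions")!
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(ChatRequest(
        model: "qwen3-8b", // hypothetical model identifier
        messages: [ChatMessage(role: "user",
                               content: "Add punctuation to: \(text)")]
    ))
    // Returns the raw JSON response; decode it as needed.
    let (data, _) = try await URLSession.shared.data(for: request)
    return data
}
```

Because the request never leaves localhost, nothing in this flow touches the network.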
Model Download
The local model (Qwen3-8B, ~5.4 GB) is downloaded automatically on first use. A progress modal appears to show the download status. This is a one-time download.
Privacy: When using the local model, your transcript never leaves your Mac. All processing happens on-device.
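Since the model file is about 5.4 GB, it is worth confirming you have enough free disk space before the first download. A minimal sketch of such a check using standard macOS APIs (whether Shuole performs exactly this check is an assumption):

```swift
import Foundation

// A minimal sketch of a pre-download free-space check.
// 5_400_000_000 bytes approximates the ~5.4 GB Qwen3-8B download.
func hasRoomForModel(requiredBytes: Int64 = 5_400_000_000) -> Bool {
    let home = FileManager.default.homeDirectoryForCurrentUser
    guard let values = try? home.resourceValues(
        forKeys: [.volumeAvailableCapacityForImportantUsageKey]),
          let free = values.volumeAvailableCapacityForImportantUsage else {
        return false
    }
    return free > requiredBytes
}
```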
Server Status
The LLM status chip in the app header shows the current server state:
- Green dot — Server is running and ready
- Gray dot — Server is stopped
Click the chip to start or stop the server manually. The server automatically shuts down when you quit Shuole.
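The status chip maps naturally onto a health probe. A minimal sketch, assuming the built-in server exposes llama.cpp's standard /health endpoint on port 8080 (the real port is internal to Shuole):

```swift
import Foundation

// A minimal sketch of the kind of check behind the status chip.
// llama.cpp's server answers GET /health with 200 once the model is loaded.
func serverIsReady() async -> Bool {
    guard let url = URL(string: "http://127.0.0.1:8080/health") else { return false }
    do {
        let (_, response) = try await URLSession.shared.data(from: url)
        return (response as? HTTPURLResponse)?.statusCode == 200
    } catch {
        return false // connection refused: server is stopped
    }
}
```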

System Requirements
For optimal performance with the local LLM, we recommend:
- 16 GB RAM or more
- Apple Silicon Mac (M1/M2/M3/M4) for best speed
The model will run on Intel Macs, but processing will be significantly slower.
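If you want to verify these recommendations programmatically, the following sketch checks installed RAM and detects Apple Silicon using standard macOS APIs:

```swift
import Foundation

// A minimal sketch that checks the recommendations above:
// at least 16 GiB of RAM and an Apple Silicon CPU.
func meetsRecommendedSpecs() -> Bool {
    let ramBytes = ProcessInfo.processInfo.physicalMemory
    let hasEnoughRAM = ramBytes >= 16 * 1_073_741_824 // 16 GiB

    var isARM64: Int32 = 0
    var size = MemoryLayout<Int32>.size
    // hw.optional.arm64 is 1 on Apple Silicon; the key is absent on Intel.
    sysctlbyname("hw.optional.arm64", &isARM64, &size, nil, 0)

    return hasEnoughRAM && isARM64 == 1
}
```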
Related
- LLM Polish — Add punctuation using local or cloud LLMs
- API Keys — Configure cloud LLM providers instead
- Installation — Full system requirements