Frequently Asked Questions

Quick answers to common questions about Shuole.

Is my audio uploaded to the cloud?

No. Shuole runs 100% on your Mac. Your audio files never leave your device.

What languages are supported?

Shuole supports 99+ languages, including English, Chinese, Japanese, Spanish, French, and German. The language is auto-detected by default.

How long does transcription take?

About 1 minute per 10 minutes of audio on Apple Silicon, so a one-hour recording takes roughly 6 minutes. Longer files are automatically split into chunks for better performance.
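As a rough illustration of the arithmetic, the sketch below estimates processing time from the 1:10 ratio quoted above and shows one simple way a long file could be cut into consecutive chunks. The function names and the 10-minute chunk length are assumptions made for this example; they are not Shuole's actual settings or internals.

```python
import math

# Ratio quoted in this FAQ: about 1 minute of processing per 10 minutes of audio.
AUDIO_MINUTES_PER_PROCESSING_MINUTE = 10


def estimate_transcription_minutes(audio_minutes: float) -> float:
    """Return a rough processing-time estimate in minutes."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return audio_minutes / AUDIO_MINUTES_PER_PROCESSING_MINUTE


def chunk_boundaries(duration_s: float, chunk_s: float = 600):
    """Split [0, duration_s] into consecutive (start, end) chunks.

    chunk_s is a hypothetical chunk length chosen for illustration only.
    """
    if chunk_s <= 0:
        raise ValueError("chunk_s must be positive")
    count = math.ceil(duration_s / chunk_s)
    # The last chunk is clipped so it never runs past the end of the file.
    return [(i * chunk_s, min((i + 1) * chunk_s, duration_s)) for i in range(count)]
```

For example, a 90-minute recording comes out at about 9 minutes of processing, and a 25-minute file with 10-minute chunks yields three chunks, the last one 5 minutes long.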

Does Shuole work on Intel Macs?

No. Shuole currently requires Apple Silicon (M1/M2/M3/M4). Intel support may be added in a future release.

Can I transcribe video files?

Yes! Shuole accepts MP4, MOV, and AVI files, and the audio track is automatically extracted for transcription. For advanced pipeline stages such as alignment and diarization, we recommend converting the video to an audio format (WAV or MP3) first for best compatibility.
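If you want to do the conversion yourself, one common approach is the ffmpeg command-line tool, which is separate from Shuole and must be installed on its own. The sketch below builds and runs an ffmpeg command that drops the video stream and writes a mono WAV file next to the original. The helper names and the 16 kHz sample rate are assumptions for this example, not documented Shuole requirements.

```python
import subprocess
from pathlib import Path


def build_extract_command(video_path: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes a mono WAV next to the video file."""
    wav_path = str(Path(video_path).with_suffix(".wav"))
    return [
        "ffmpeg",
        "-i", video_path,         # input video
        "-vn",                    # drop the video stream
        "-ac", "1",               # mix down to one audio channel
        "-ar", str(sample_rate),  # output sample rate in Hz (assumed value)
        wav_path,
    ]


def extract_audio(video_path: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_extract_command(video_path), check=True)
```

Calling extract_audio("interview.mp4") would produce interview.wav in the same folder, which you can then open in Shuole.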

What is Speaker Diarization?

Speaker diarization identifies different speakers in your audio and labels segments as 'Speaker 1', 'Speaker 2', etc. You can then map these to real names in the Speaker Database.
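Conceptually, mapping labels to names is a simple lookup. The sketch below assumes each transcript segment is a dictionary with a "speaker" field; that layout and the function name are hypothetical and are not taken from Shuole's data format.

```python
def apply_speaker_names(segments, names):
    """Return copies of the segments with generic labels replaced by real names.

    Labels without an entry in `names` are left unchanged.
    """
    return [
        {**seg, "speaker": names.get(seg["speaker"], seg["speaker"])}
        for seg in segments
    ]


segments = [
    {"speaker": "Speaker 1", "text": "Welcome to the show."},
    {"speaker": "Speaker 2", "text": "Thanks for having me."},
]
named = apply_speaker_names(segments, {"Speaker 1": "Alice"})
```

Here the first segment is attributed to Alice, while the second keeps its generic 'Speaker 2' label until a name is assigned.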

How do I exclude parts of the audio?

Use the Timeline feature to visually select ranges you want to skip (e.g., intro music, dead air). These segments won't be transcribed.
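To illustrate the idea, the sketch below takes a file's duration and a list of excluded ranges (in seconds) and returns the ranges that remain to be transcribed. It is a generic interval calculation written for this FAQ, with a hypothetical function name, and does not describe how the Timeline feature is implemented.

```python
def kept_ranges(duration, excluded):
    """Return the (start, end) ranges of [0, duration] not covered by `excluded`.

    `excluded` is a list of (start, end) pairs in seconds; they may overlap
    or be given in any order.
    """
    kept = []
    cursor = 0.0  # end of the region already accounted for
    for start, end in sorted(excluded):
        # Clip each excluded range to the file's bounds.
        start = max(start, 0.0)
        end = min(end, duration)
        if end <= start:
            continue  # empty or fully out-of-bounds range
        if start > cursor:
            kept.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        kept.append((cursor, duration))
    return kept
```

For a 100-second file with 0 to 10 s (intro music) and 40 to 50 s (dead air) excluded, the remaining ranges are 10 to 40 s and 50 to 100 s.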

Where are my transcriptions saved?

Transcriptions are stored locally in the app's data folder. You can export to SRT or JSON at any time.
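For reference, SRT is a plain-text subtitle format: each entry has a running number, a time range written as HH:MM:SS,mmm --> HH:MM:SS,mmm, and the text, followed by a blank line. The sketch below produces that layout from a list of segments. The segment fields ("start", "end", "text") and the function names are assumptions for illustration; the structure of Shuole's JSON export may differ.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render segments (dicts with start, end, text) as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{time_range}\n{seg['text']}\n")
    # Entries are separated by a blank line.
    return "\n".join(blocks)
```

A segment that starts at 3661.5 seconds, for instance, is written with the timestamp 01:01:01,500.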

What is LLM Polish?

LLM Polish uses a local language model to improve punctuation and fix minor transcription errors. It runs on-device using llama.cpp.

How much disk space does Shuole need?

The app itself is small, but its downloads require additional space: ~1.5 GB for runtime dependencies, ~2.5 GB for transcription models, and ~5.4 GB for the optional local LLM. That comes to roughly 4 GB without the LLM, or about 9.4 GB with it.