Transcription

Configure local or cloud transcription providers

Yap supports two transcription providers: Local (using whisper.cpp) and Cloud (using OpenAI’s Whisper API).

Local Transcription

Local transcription processes audio entirely on your device using whisper.cpp.

Advantages

  • Private — Audio never leaves your device
  • Offline — Works without internet connection
  • Free — No API costs or subscriptions

Disadvantages

  • Slower — Processing depends on your hardware
  • Resource usage — Uses CPU/RAM during transcription
  • Less accurate — May struggle with accents or specialized vocabulary

Requirements

You need whisper.cpp installed on your system (the Homebrew package is named whisper-cpp):

# macOS
brew install whisper-cpp

# Linux (build from source)
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp && make
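
To confirm the install worked, you can check whether a whisper.cpp binary is on your PATH. The following is a minimal sketch, not Yap's own detection logic: the helper name find_first_binary is hypothetical, and the candidate binary names (whisper-cli, whisper-cpp) are assumptions that vary by whisper.cpp version and packaging.

```shell
# Print the path of the first candidate binary found on PATH.
# Returns 1 if none of the candidates is installed.
find_first_binary() {
  for name in "$@"; do
    if command -v "$name" >/dev/null 2>&1; then
      command -v "$name"
      return 0
    fi
  done
  return 1
}

# Candidate names are assumptions; adjust them to match your install.
if bin=$(find_first_binary whisper-cli whisper-cpp); then
  echo "Found whisper.cpp binary: $bin"
  # A manual test run would look roughly like this (-m selects the model
  # file, -f the input audio; both paths here are placeholders):
  #   "$bin" -m /path/to/ggml-base.en.bin -f recording.wav
else
  echo "No whisper.cpp binary found on PATH" >&2
fi
```

If you built from source, the binary may live inside the build directory rather than on PATH, in which case this check reports nothing even though the build succeeded.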

Cloud Transcription (OpenAI)

Cloud transcription uses OpenAI’s Whisper API for maximum accuracy.

Advantages

  • Most accurate — Best-in-class transcription
  • Fast — Processing happens on powerful servers
  • No local resources — Doesn’t slow down your computer

Disadvantages

  • Requires internet — Won’t work offline
  • API costs — Pay per usage (~$0.006/minute)
  • Privacy — Audio is sent to OpenAI’s servers

Setup

  1. Get an API key from OpenAI
  2. Go to Settings in Yap
  3. Select Cloud (OpenAI) as your provider
  4. Enter your API key
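
If transcription fails after setup, it can help to confirm the key works outside Yap. The sketch below calls OpenAI's audio transcription endpoint directly with curl. It assumes the key is exported as OPENAI_API_KEY and uses the whisper-1 model; the function name transcribe_with_openai is hypothetical, and the request Yap itself sends may differ.

```shell
# Send one audio file to OpenAI's transcription endpoint.
# Assumes OPENAI_API_KEY is exported in the environment.
transcribe_with_openai() {
  audio_file=$1

  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 2
  fi
  if [ ! -f "$audio_file" ]; then
    echo "Audio file not found: $audio_file" >&2
    return 3
  fi

  # -F sends multipart/form-data; "@" tells curl to upload the file contents.
  curl -sS https://api.openai.com/v1/audio/transcriptions \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -F model="whisper-1" \
    -F file="@$audio_file"
}

# Usage (placeholder file name; this makes a billable API request):
#   transcribe_with_openai recording.wav
```

A JSON response containing a text field indicates the key is valid; an authentication error points to a problem with the key rather than with Yap.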

Tip: At roughly $0.006 per minute, one hour of audio costs about $0.36, so a typical month of heavy use costs just a few dollars.
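
To estimate your own monthly bill, multiply the minutes of audio by the per-minute rate. This is a rough sketch using the approximate $0.006/minute figure quoted above; the function name estimate_cost and the sample durations are illustrative, and actual pricing may change.

```shell
# Estimate the cost in dollars for a number of audio minutes.
# The rate defaults to the approximate $0.006/minute quoted above.
estimate_cost() {
  LC_ALL=C awk -v minutes="$1" -v rate="${2:-0.006}" \
    'BEGIN { printf "%.2f\n", minutes * rate }'
}

estimate_cost 600    # 10 hours of audio: prints 3.60
estimate_cost 1800   # 30 hours of audio: prints 10.80
```

Pass a second argument to try a different rate, for example estimate_cost 600 0.01.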

Choosing a Provider

Factor          Local                 Cloud
Privacy         Excellent             Data sent to OpenAI
Accuracy        Good                  Excellent
Speed           Depends on hardware   Fast
Cost            Free                  ~$0.006/minute
Offline         Yes                   No
Resource usage  High                  Low

Recommendations

  • Privacy-focused users → Local
  • Professionals needing accuracy → Cloud
  • Offline/travel use → Local
  • Low-power devices → Cloud
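
The recommendations above amount to a simple lookup from your main priority to a provider. As an illustration only, here is that mapping as a small shell function; the function name and the priority keywords are hypothetical and are not Yap settings.

```shell
# Map a primary concern to the recommended provider.
# Keywords are illustrative labels for the four recommendations above.
recommend_provider() {
  case "$1" in
    privacy|offline)    echo "Local" ;;
    accuracy|low-power) echo "Cloud" ;;
    *)
      echo "Unknown priority: $1" >&2
      return 1
      ;;
  esac
}

recommend_provider offline    # prints Local
recommend_provider accuracy   # prints Cloud
```

When two priorities conflict, such as privacy on a low-power device, the lookup cannot decide for you; the comparison table is the better guide.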

Switching Providers

You can switch providers at any time:

  1. Open Settings
  2. Under Transcription Provider, select your choice
  3. The change takes effect immediately

New recordings use the selected provider. History items keep their original transcription; they are not re-transcribed when you switch.