Audio / Voice Notes — 2025-12-05

What works

Optional transcription: If routing.transcribeAudio.command is set in ~/.clawdbot/clawdbot.json, CLAWDBOT will:
1. Download inbound audio to a temp path when WhatsApp only provides a URL.
2. Run the configured CLI (templated with {{MediaPath}}), expecting transcript on stdout.
3. Replace Body with the transcript, set {{Transcript}}, and prepend the original media path plus a Transcript: section to the command prompt so models see both.
4. Continue through the normal auto-reply pipeline (templating, sessions, Pi command).
Verbose logging: With --verbose, we log when transcription runs and when the transcript replaces the body.
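
The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not CLAWDBOT's actual implementation: the function names (render_command, transcribe, apply_transcript) and the dictionary-shaped message context are assumptions made for the example. Only the {{MediaPath}} placeholder, the stdout contract, and the Body/Transcript fields come from the description above.

```python
import subprocess
import sys
from typing import Optional, Sequence


def render_command(template: Sequence[str], media_path: str) -> list:
    """Substitute the {{MediaPath}} placeholder in every argument."""
    return [arg.replace("{{MediaPath}}", media_path) for arg in template]


def transcribe(template: Sequence[str], media_path: str,
               timeout_seconds: float = 45) -> Optional[str]:
    """Run the configured CLI; return its stdout, or None on any failure."""
    argv = render_command(template, media_path)
    try:
        result = subprocess.run(argv, capture_output=True, text=True,
                                timeout=timeout_seconds)
    except (OSError, subprocess.TimeoutExpired):
        return None  # CLI missing or too slow: caller keeps the original body
    if result.returncode != 0:
        return None  # non-zero exit is treated as a failed transcription
    return result.stdout.strip() or None


def apply_transcript(ctx: dict, transcript: Optional[str]) -> dict:
    """Replace Body and set Transcript (hypothetical context shape)."""
    if transcript is None:
        return dict(ctx)  # fall back to the original body/media note
    return {**ctx, "Body": transcript, "Transcript": transcript}


if __name__ == "__main__":
    # Stand-in "transcriber": a tiny Python one-liner that echoes its argument.
    fake_cli = [sys.executable, "-c",
                "import sys; print('heard ' + sys.argv[1])", "{{MediaPath}}"]
    ctx = {"Body": "<media:audio>", "MediaPath": "/tmp/voice.ogg"}
    print(apply_transcript(ctx, transcribe(fake_cli, ctx["MediaPath"])))
```

Passing the command as an argument list rather than a shell string means the media path never needs shell quoting.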

Config example (OpenAI Whisper CLI)

Requires OPENAI_API_KEY in the environment and the openai CLI installed:

{
  routing: {
    transcribeAudio: {
      command: [
        "openai",
        "api",
        "audio.transcriptions.create",
        "-m",
        "whisper-1",
        "-f",
        "{{MediaPath}}",
        "--response-format",
        "text"
      ],
      timeoutSeconds: 45
    }
  }
}

Notes & limits

We don’t ship a transcriber; you opt in with any CLI that prints text to stdout (Whisper cloud, whisper.cpp, vosk, Deepgram, etc.).
Size guard: inbound audio must be ≤5 MB (matches the temp media store and transcript pipeline).
Outbound caps: web send supports audio/voice up to 16 MB (sent as a voice note with ptt: true).
If transcription fails, we fall back to the original body/media note; replies still go through.
Transcript is available to templates as {{Transcript}}; models get both the media path and a Transcript: block in the prompt when using command mode.
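
As a rough illustration of the size guard and the prompt shape described in these notes, the Python sketch below checks the 5 MB inbound limit and prepends the media path and a Transcript: block to a prompt. The exact prompt layout, the function names, and the reading of 5 MB as 5 × 1024 × 1024 bytes are assumptions made for the example. The source only states that both pieces are prepended.

```python
# Assumption: "5 MB" is read as 5 MiB; the real guard may use 5,000,000 bytes.
MAX_INBOUND_AUDIO_BYTES = 5 * 1024 * 1024


def within_inbound_limit(size_bytes: int) -> bool:
    """Return True when inbound audio is small enough to be transcribed."""
    return 0 <= size_bytes <= MAX_INBOUND_AUDIO_BYTES


def build_command_prompt(media_path: str, transcript: str, prompt: str) -> str:
    """Prepend the media path and a Transcript: block (hypothetical layout)."""
    header = f"[media attached: {media_path}]\nTranscript:\n{transcript}"
    return f"{header}\n\n{prompt}"


if __name__ == "__main__":
    print(within_inbound_limit(3 * 1024 * 1024))
    print(build_command_prompt("/tmp/voice.ogg", "call me back", "Reply briefly."))
```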

Gotchas

Ensure your CLI exits 0 and prints plain text; if it emits JSON, pipe the output through jq -r .text to extract the transcript.
Keep timeouts reasonable (timeoutSeconds, default 45 s) so a slow transcriber does not block the reply queue.
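
If jq is not available, a small wrapper can do the same extraction. The Python sketch below assumes the transcriber emits a JSON object with a top-level text field, which is the shape jq -r .text targets. The function name extract_text is hypothetical, and the pass-through for non-JSON output is a design choice of this example rather than documented behavior.

```python
import json
import sys


def extract_text(raw: str) -> str:
    """Return the "text" field of a JSON payload, or the input if not JSON."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return raw.strip()  # already plain text: pass it through unchanged
    if isinstance(payload, dict) and isinstance(payload.get("text"), str):
        return payload["text"].strip()
    return raw.strip()  # JSON without a usable text field: leave as-is


if __name__ == "__main__":
    # Intended use: some-transcriber --json file.ogg | python extract_text.py
    sys.stdout.write(extract_text(sys.stdin.read()) + "\n")
```

Because the wrapper writes only the transcript to stdout and exits 0, it satisfies the plain-text contract that the transcription command is expected to meet.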
