Talk Mode
Talk mode is a continuous voice conversation loop:
- Listen for speech
- Send transcript to the model (main session, chat.send)
- Wait for the response
- Speak it via ElevenLabs (streaming playback)
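The loop above can be sketched roughly as follows. This is only an illustration of the control flow, not the app's implementation: `listen`, `send_to_model`, and `speak` are hypothetical callables standing in for speech recognition, `chat.send` against the main session, and ElevenLabs streaming playback.

```python
from typing import Callable, Optional


def talk_loop(
    listen: Callable[[], Optional[str]],
    send_to_model: Callable[[str], str],
    speak: Callable[[str], None],
) -> int:
    """Run listen -> send -> wait -> speak until listen() returns None.

    All three callables are hypothetical stand-ins, not real Talk mode APIs.
    Returns the number of completed turns.
    """
    turns = 0
    while True:
        transcript = listen()              # blocks until a transcript is ready
        if transcript is None:             # assumed signal: Talk mode was switched off
            return turns
        if not transcript.strip():         # nothing was said; keep listening
            continue
        reply = send_to_model(transcript)  # blocks until the response arrives
        speak(reply)                       # hand the reply to TTS playback
        turns += 1
```

Passing the three stages in as callables keeps the sketch testable with fakes; a real client would presumably drive the same steps from asynchronous audio callbacks.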
Behavior (macOS)
- Always-on overlay while Talk mode is enabled.
- Listening → Thinking → Speaking phase transitions.
- On a short pause (silence window), the current transcript is sent.
- Replies are written to WebChat (same as typing).
- Interrupt on speech (default on): if the user starts talking while the assistant is speaking, we stop playback and note the interruption timestamp for the next prompt.
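One way to picture the phase transitions, the silence window, and interrupt-on-speech is a small state machine. The sketch below is an assumption-laden illustration: the class, its method names, and the 0.7 second silence window are hypothetical and are not taken from the app.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(Enum):
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"


@dataclass
class TalkState:
    """Hypothetical state holder; all field names are illustrative."""
    interrupt_on_speech: bool = True
    silence_window: float = 0.7      # seconds; assumed value, not from the docs
    phase: Phase = Phase.LISTENING
    transcript: str = ""
    last_speech_at: Optional[float] = None
    interrupted_at: Optional[float] = None

    def on_speech(self, now: float, text: str = "") -> None:
        """The user said `text` at time `now` (seconds)."""
        if self.phase is Phase.SPEAKING:
            if not self.interrupt_on_speech:
                return
            # Stop playback and note when the interruption happened,
            # so it can be mentioned in the next prompt.
            self.interrupted_at = now
            self.phase = Phase.LISTENING
            self.transcript = ""
        if self.phase is Phase.LISTENING:
            self.transcript = (self.transcript + " " + text).strip()
            self.last_speech_at = now

    def on_tick(self, now: float) -> Optional[str]:
        """Return the transcript to send once the silence window has elapsed."""
        if self.phase is not Phase.LISTENING or not self.transcript:
            return None
        if now - self.last_speech_at < self.silence_window:
            return None
        to_send, self.transcript = self.transcript, ""
        self.phase = Phase.THINKING
        return to_send

    def on_reply_ready(self) -> None:
        self.phase = Phase.SPEAKING

    def on_playback_finished(self) -> None:
        self.phase = Phase.LISTENING
```

In this sketch, speech that arrives during the Thinking phase is simply ignored; how the app treats that case is not stated here.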
Voice directives in replies
The assistant may prefix its reply with a single JSON line to control voice:
- First non-empty line only.
- Unknown keys are ignored.
- once: true applies to the current reply only.
- Without once, the voice becomes the new default for Talk mode.
- The JSON line is stripped before TTS playback.
Supported keys:
- voice / voice_id / voiceId
- model / model_id / modelId
- speed, rate (WPM), stability, similarity, style, speakerBoost
- seed, normalize, lang, output_format, latency_tier
- once
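As an illustration of these rules, a reply whose first non-empty line is `{"voice": "example-voice-id", "once": true}` (a made-up voice ID) would be spoken with that voice once, and the JSON line itself would not be read aloud. A minimal parsing sketch follows. The function names and the choice of `voice` and `model` as canonical key names are assumptions of this sketch. It also persists every setting in a directive without `once`, whereas the rules above only state this for the voice.

```python
import json
from typing import Any, Dict, Optional, Tuple

# Alias -> canonical name. The canonical names are this sketch's choice.
ALIASES = {
    "voice": "voice", "voice_id": "voice", "voiceId": "voice",
    "model": "model", "model_id": "model", "modelId": "model",
}
PLAIN_KEYS = {
    "speed", "rate", "stability", "similarity", "style", "speakerBoost",
    "seed", "normalize", "lang", "output_format", "latency_tier", "once",
}


def split_directive(reply: str) -> Tuple[Optional[Dict[str, Any]], str]:
    """Return (directive, text_to_speak); directive is None when absent."""
    lines = reply.splitlines()
    for i, line in enumerate(lines):
        if not line.strip():
            continue                      # skip leading blank lines
        try:
            raw = json.loads(line.strip())
        except json.JSONDecodeError:
            return None, reply            # first non-empty line is not JSON
        if not isinstance(raw, dict):
            return None, reply            # JSON, but not an object
        directive: Dict[str, Any] = {}
        for key, value in raw.items():
            if key in ALIASES:
                directive[ALIASES[key]] = value
            elif key in PLAIN_KEYS:
                directive[key] = value
            # any other key is unknown and ignored
        rest = "\n".join(lines[i + 1:]).lstrip("\n")
        return directive, rest            # JSON line stripped from the text
    return None, reply


def apply_directive(
    defaults: Dict[str, Any], directive: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Return the settings for this reply; update defaults unless once is true."""
    if not directive:
        return dict(defaults)
    settings = {k: v for k, v in directive.items() if k != "once"}
    if directive.get("once") is not True:
        defaults.update(settings)         # assumption: all settings persist
    return {**defaults, **settings}
```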
Config (~/.clawdbot/clawdbot.json)
- interruptOnSpeech: defaults to true
- voiceId: falls back to ELEVENLABS_VOICE_ID / SAG_VOICE_ID (or the first ElevenLabs voice when an API key is available)
- modelId: defaults to eleven_v3 when unset
- apiKey: falls back to ELEVENLABS_API_KEY (or the gateway shell profile if available)
- outputFormat: defaults to pcm_44100 on macOS/iOS and pcm_24000 on Android (set mp3_* to force MP3 streaming)
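The fallback rules for these settings can be expressed as a small resolver. This is a sketch under stated assumptions, not the app's code: `resolve_talk_config` and the platform labels are hypothetical, checking ELEVENLABS_VOICE_ID before SAG_VOICE_ID is a guess at precedence, and neither the gateway shell profile nor the "first ElevenLabs voice" lookup is modelled. It takes a flat dict of Talk settings and makes no claim about how they are nested inside clawdbot.json.

```python
import os
from typing import Any, Dict, Mapping, Optional


def resolve_talk_config(
    cfg: Mapping[str, Any],
    platform: str,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Apply the documented fallbacks to a flat dict of Talk settings.

    `platform` is one of "macos", "ios", "android" (labels chosen for this
    sketch). `env` defaults to the process environment.
    """
    env = os.environ if env is None else env
    voice_id = (
        cfg.get("voiceId")
        or env.get("ELEVENLABS_VOICE_ID")   # precedence between the two
        or env.get("SAG_VOICE_ID")          # variables is assumed
    )  # may still be None; the app would then pick the first ElevenLabs voice
    api_key = cfg.get("apiKey") or env.get("ELEVENLABS_API_KEY")
    default_format = "pcm_24000" if platform == "android" else "pcm_44100"
    return {
        "interruptOnSpeech": cfg.get("interruptOnSpeech", True),
        "voiceId": voice_id,
        "modelId": cfg.get("modelId") or "eleven_v3",
        "apiKey": api_key,
        "outputFormat": cfg.get("outputFormat") or default_format,
    }
```

Passing `env` explicitly makes the fallbacks easy to exercise without touching real environment variables.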
macOS UI
- Menu bar toggle: Talk
- Config tab: Talk Mode group (voice id + interrupt toggle)
- Overlay:
  - Listening: cloud pulses with mic level
  - Thinking: sinking animation
  - Speaking: radiating rings
  - Click cloud: stop speaking
  - Click X: exit Talk mode
Notes
- Requires Speech + Microphone permissions.
- Uses chat.send against session key main.
- TTS uses ElevenLabs streaming API with ELEVENLABS_API_KEY and incremental playback on macOS/iOS/Android for lower latency.
- stability for eleven_v3 is validated to 0.0, 0.5, or 1.0; other models accept 0..1.
- latency_tier is validated to 0..4 when set.
- Android supports pcm_16000, pcm_22050, pcm_24000, and pcm_44100 output formats for low-latency AudioTrack streaming.
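The validation rules in these notes could be checked along the following lines. The helper names are hypothetical, and the sketch only reports whether a value is acceptable; what the app does with a rejected value is not stated here.

```python
from typing import Optional

# PCM formats listed above as supported for Android AudioTrack streaming.
ANDROID_PCM_FORMATS = {"pcm_16000", "pcm_22050", "pcm_24000", "pcm_44100"}


def valid_stability(value: float, model_id: str) -> bool:
    """eleven_v3 accepts only 0.0, 0.5, or 1.0; other models accept 0..1."""
    if model_id == "eleven_v3":
        return value in (0.0, 0.5, 1.0)
    return 0.0 <= value <= 1.0


def valid_latency_tier(value: Optional[int]) -> bool:
    """latency_tier is optional; when set it must be an integer in 0..4."""
    if value is None:
        return True
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 <= value <= 4


def android_low_latency(output_format: str) -> bool:
    """True when the format is one of the listed Android PCM formats."""
    return output_format in ANDROID_PCM_FORMATS
```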