JoBot can speak aloud in your voice channel using ElevenLabs text-to-speech (TTS). When you ask it to say something out loud, it generates audio through the ElevenLabs API and plays it directly in the voice channel. Both an ElevenLabs API key and a Voice ID must be configured before use.

Requirements

Before TTS will work, you need:
  • An ElevenLabs API key — sign up at elevenlabs.io
  • An ElevenLabs Voice ID — the ID of the voice you want JoBot to use
See requirements for a full list of prerequisites for running JoBot.

How it works

When JoBot decides to speak — either because you explicitly asked it to, or because it determines a verbal response is appropriate — it calls the ElevenLabs API to generate an MP3 audio clip from the response text. That clip is then streamed into the voice channel JoBot is currently connected to. If music is playing when TTS is triggered, playback pauses while JoBot speaks and resumes afterward.
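
As a rough illustration of the API call described above, the following Python sketch builds the HTTP request for the ElevenLabs text-to-speech endpoint and, separately, sends it. This is not JoBot's actual code, which this page does not show. The function names `build_tts_request` and `fetch_tts_audio`, the default `model_id`, and the choice of Python are all assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

# Public ElevenLabs text-to-speech endpoint; the voice ID goes in the path.
ELEVENLABS_TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(
    text: str,
    api_key: str,
    voice_id: str,
    model_id: str = "eleven_multilingual_v2",  # assumed default, not from this page
) -> urllib.request.Request:
    """Build (but do not send) a POST request asking ElevenLabs for MP3 audio."""
    if not text.strip():
        raise ValueError("text must not be empty")
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    url = ELEVENLABS_TTS_URL.format(voice_id=urllib.parse.quote(voice_id, safe=""))
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "xi-api-key": api_key,          # ElevenLabs authenticates with this header
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",         # ask for an MP3 clip
        },
        method="POST",
    )


def fetch_tts_audio(request: urllib.request.Request) -> bytes:
    """Send the request and return the raw MP3 bytes (needs network access and a valid key)."""
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Splitting request construction from sending keeps the first step easy to inspect offline. The bytes returned by `fetch_tts_audio` would then be handed to whatever component plays audio in the voice channel.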

Triggering TTS

You can ask JoBot to speak in plain English. Example prompts:
  • @JoBot say hello to everyone
  • @JoBot read that out loud in your voice
  • @JoBot tell us a joke out loud
  • @JoBot announce that the game is starting
JoBot interprets these as requests to generate and play TTS audio rather than reply in text.

Configuration

TTS is configured through environment variables. Set the following in your environment:
Variable               Description
ElevenLabs__ApiKey     Your ElevenLabs API key
ElevenLabs__VoiceId    The ID of the ElevenLabs voice to use
See environment variables for the full list of configuration options and how to set them.
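
To make the configuration step concrete, here is a minimal Python sketch that reads the two variables and fails early if either is missing. The `TtsConfig` class and `load_tts_config` function are hypothetical names used only for illustration. How JoBot itself reads and validates these values is not described on this page.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names as documented for JoBot.
API_KEY_VAR = "ElevenLabs__ApiKey"
VOICE_ID_VAR = "ElevenLabs__VoiceId"


@dataclass(frozen=True)
class TtsConfig:
    """Hypothetical container for the two TTS settings."""
    api_key: str
    voice_id: str


def load_tts_config(env: Mapping[str, str] = os.environ) -> TtsConfig:
    """Read the TTS settings from the environment, raising if any are unset or blank."""
    missing = [name for name in (API_KEY_VAR, VOICE_ID_VAR)
               if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return TtsConfig(
        api_key=env[API_KEY_VAR].strip(),
        voice_id=env[VOICE_ID_VAR].strip(),
    )
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching the real environment, for example `load_tts_config({"ElevenLabs__ApiKey": "k", "ElevenLabs__VoiceId": "v"})`.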

Note: TTS temporarily interrupts music playback. Once JoBot finishes speaking, music resumes automatically.

Warning: TTS will not work unless JoBot is already connected to a voice channel. Ask it to join your voice channel first, then request TTS.
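
The two behaviors above (TTS requires an active voice connection, and music pauses while JoBot speaks) can be modeled with a small Python sketch. `VoiceSession`, `NotConnectedError`, and the event log are hypothetical stand-ins rather than JoBot's real classes, and resuming music even when playback fails is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List


class NotConnectedError(RuntimeError):
    """Raised when TTS is requested without an active voice connection."""


@dataclass
class VoiceSession:
    """Toy model of a voice connection that records what would happen."""
    connected: bool = False
    music_playing: bool = False
    log: List[str] = field(default_factory=list)

    def speak(self, clip_name: str) -> None:
        # TTS needs a voice channel to play into.
        if not self.connected:
            raise NotConnectedError("join a voice channel before requesting TTS")

        was_playing = self.music_playing
        if was_playing:
            self.music_playing = False
            self.log.append("pause music")
        try:
            # A real bot would stream the MP3 clip here.
            self.log.append(f"play {clip_name}")
        finally:
            # Assumption: music is restored whether or not playback succeeded.
            if was_playing:
                self.music_playing = True
                self.log.append("resume music")
```

For example, calling `speak("tts.mp3")` on a connected session with music playing records a pause, the clip, and a resume, in that order. Calling it on a disconnected session raises `NotConnectedError`.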