Can I use my own cloned voice?

Yes, if you have created a custom voice clone in ElevenLabs, it will appear in the voice selection list. You can use any voice available in your ElevenLabs account.

What audio formats are supported?

The integration supports MP3, OGG, WAV, and other common audio formats. The output format is automatically optimized for each connected channel to ensure compatibility.

Does the agent support speech-to-text input?

Voice input processing depends on the channel. WhatsApp and Telegram automatically transcribe voice messages in some cases. You can also configure a separate speech-to-text service for channels that do not provide transcription.

Voice Channel Setup: ElevenLabs TTS Integration

Add natural-sounding text-to-speech with ElevenLabs so your agent can respond with voice messages on any channel.

What You Will Get

After this setup, your OpenClaw agent will be able to generate and send voice messages using ElevenLabs' natural-sounding text-to-speech engine. Your agent can send expressive, human-sounding audio messages in place of text replies or alongside them.

Voice responses add a personal, accessible dimension to your agent. Users who prefer listening over reading, or those in hands-free situations, will appreciate receiving spoken replies. ElevenLabs provides a wide selection of voices with customizable tone, speed, and emotional expression.

This integration works across any channel that supports audio messages, including WhatsApp, Telegram, Discord, and more. You will configure voice selection, audio quality settings, and triggers for when the agent should respond with voice instead of text.

Step-by-Step Setup

Configure ElevenLabs TTS as a voice channel for your OpenClaw agent.

Get Your ElevenLabs API Key

Log in to your ElevenLabs account and navigate to Profile Settings to find your API key. Copy the key and keep it secure. If you do not have an ElevenLabs account, create one and choose a plan that fits your expected voice message volume.

Add the Voice Channel

In your RunTheAgent dashboard, go to Channels and select Voice (ElevenLabs). Enter your API key and the system will verify the connection. You will see a list of available voices once the key is validated successfully.
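
If you want to check the key yourself before entering it in the dashboard, the following is a minimal sketch, not the dashboard's own verification logic. It assumes the key is stored in an environment variable named ELEVENLABS_API_KEY (a name chosen here for illustration) and uses the public ElevenLabs voices endpoint with the xi-api-key header. The helper names are hypothetical.

```python
import json
import os
import urllib.request

ELEVENLABS_VOICES_URL = "https://api.elevenlabs.io/v1/voices"


def load_api_key(env=os.environ, var_name="ELEVENLABS_API_KEY"):
    """Read the key from the environment instead of hard-coding it.

    The variable name is an assumption made for this example.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    return key


def build_voices_request(api_key):
    """Build (but do not send) a request that lists the account's voices."""
    return urllib.request.Request(
        ELEVENLABS_VOICES_URL,
        headers={"xi-api-key": api_key, "Accept": "application/json"},
    )


def list_voices(api_key, timeout=10):
    """Send the request; an HTTP 401 error here usually means the key is invalid."""
    with urllib.request.urlopen(build_voices_request(api_key), timeout=timeout) as response:
        body = json.load(response)
    # Return (voice_id, name) pairs; .get() keeps this tolerant of missing fields.
    return [(v.get("voice_id"), v.get("name")) for v in body.get("voices", [])]
```

Calling list_voices(load_api_key()) performs a real network request, so run it only where the key is available and outbound access is allowed.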

Select a Voice

Browse the available voices and preview them by clicking the play button next to each option. Choose a voice that matches your agent's personality and use case. You can select from premade voices or use a custom voice you have created in ElevenLabs. Set this as the default voice for your agent.

Configure Audio Settings

Adjust the voice parameters including stability, similarity boost, and style. Higher stability produces more consistent output while lower stability adds more expressiveness. Set the output audio format to match your channels, such as MP3 for WhatsApp or OGG for Telegram.
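
To make these settings concrete, here is a rough sketch of how they could be represented and validated in code. It assumes each parameter is a value between 0.0 and 1.0, as in the ElevenLabs voice settings, and the default values are illustrative only. The VoiceSettings class, the output_format_for helper, and the MP3 fallback for unlisted channels are assumptions for this example, not part of the dashboard.

```python
from dataclasses import asdict, dataclass

# Formats named in this guide. Any other channel falls back to MP3 (an assumption).
CHANNEL_FORMATS = {"whatsapp": "mp3", "telegram": "ogg"}


@dataclass(frozen=True)
class VoiceSettings:
    # Higher stability gives more consistent output, lower gives more expressiveness.
    stability: float = 0.5
    similarity_boost: float = 0.75
    style: float = 0.0

    def __post_init__(self):
        # Reject values outside the assumed 0.0 to 1.0 range.
        for name, value in asdict(self).items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")


def output_format_for(channel, default="mp3"):
    """Pick the audio format for a channel name, case-insensitively."""
    return CHANNEL_FORMATS.get(channel.lower(), default)
```

For example, VoiceSettings(stability=0.3) describes a more expressive voice, and output_format_for("Telegram") selects OGG.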

Set Voice Triggers

Define when your agent should respond with voice instead of text. Options include always responding with voice, responding with voice only when the user sends a voice message, or using a keyword trigger. You can also configure a dual-response mode that sends both text and voice for maximum accessibility.
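
The trigger options above can be pictured as a small decision function. This is only a sketch of the logic under assumed names. TriggerMode, choose_reply_types, and the example keywords are hypothetical and do not reflect how the dashboard stores these settings.

```python
from enum import Enum


class TriggerMode(Enum):
    ALWAYS = "always"              # every reply is spoken
    MIRROR_VOICE = "mirror_voice"  # speak only when the user sent a voice message
    KEYWORD = "keyword"            # speak only when the message contains a keyword


def choose_reply_types(mode, user_sent_voice, text,
                       keywords=("voice", "read aloud"), dual_response=False):
    """Return the reply types to send: {"text"}, {"voice"}, or both."""
    if mode is TriggerMode.ALWAYS:
        wants_voice = True
    elif mode is TriggerMode.MIRROR_VOICE:
        wants_voice = user_sent_voice
    else:
        lowered = text.lower()
        wants_voice = any(keyword in lowered for keyword in keywords)

    if not wants_voice:
        return {"text"}
    # Dual-response mode sends the text alongside the audio.
    return {"voice", "text"} if dual_response else {"voice"}
```

With TriggerMode.MIRROR_VOICE and dual_response=True, a user who sends a voice message receives both an audio reply and its text, while a user who types receives text only.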

Configure Channel Routing

Select which connected channels should receive voice messages. Enable voice for channels that support audio natively, such as WhatsApp and Telegram. For channels without audio support, such as SMS, the agent falls back to text automatically. Configure fallback behavior in the routing settings.
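
The fallback rule can be sketched as follows. The set of audio-capable channels is taken from the channels named in this guide, and the function name and signature are assumptions for illustration rather than the routing engine itself.

```python
# Channels this guide names as supporting audio messages.
AUDIO_CAPABLE = {"whatsapp", "telegram", "discord"}


def route_reply(channel, voice_enabled_channels, wants_voice):
    """Return "voice" only when the channel can play audio and voice is enabled for it."""
    channel = channel.lower()
    if wants_voice and channel in voice_enabled_channels and channel in AUDIO_CAPABLE:
        return "voice"
    # Anything else, including SMS, falls back to text.
    return "text"
```

Under this sketch, enabling voice for SMS has no effect, because the channel cannot carry audio and the reply is delivered as text.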

Test Voice Output

Send a message to your agent through a voice-enabled channel. Verify that the agent responds with an audio message and that the voice sounds correct. Test different message lengths to ensure longer responses are handled smoothly. Check the audio quality on both mobile and desktop clients.

Tips and Best Practices

Keep Voice Responses Concise

Long voice messages can feel tedious to listen to. Configure your agent to keep voice responses under 30 seconds and provide a text summary alongside longer audio when needed.
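
One rough way to check a reply against the 30-second guideline before generating audio is to estimate its spoken length from the word count. The sketch below assumes a speaking rate of 150 words per minute, which is an approximation that varies by voice and speed setting. The function names are hypothetical.

```python
def estimate_speech_seconds(text, words_per_minute=150):
    """Estimate spoken duration from the word count (a rough approximation)."""
    return len(text.split()) * 60.0 / words_per_minute


def fits_voice_limit(text, max_seconds=30.0, words_per_minute=150):
    """True when the estimated duration stays within the limit."""
    return estimate_speech_seconds(text, words_per_minute) <= max_seconds
```

At the assumed rate, 30 seconds corresponds to about 75 words. Replies that exceed it are candidates for a shorter spoken version with a text summary alongside.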

Cache Common Responses

Enable audio caching for frequently generated responses. This reduces API calls to ElevenLabs and speeds up response delivery for common queries.
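
To illustrate why caching saves API calls, here is a minimal in-memory sketch. It is not the platform's cache. The AudioCache class is hypothetical, and the synthesize argument stands in for whatever function actually calls ElevenLabs. The key includes the voice, settings, and format, because the same text rendered with a different voice is different audio.

```python
import hashlib
import json


class AudioCache:
    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key_for(text, voice_id, settings, audio_format):
        """Derive a stable key from everything that affects the audio."""
        payload = json.dumps(
            {"text": text.strip(), "voice_id": voice_id,
             "settings": settings, "format": audio_format},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_create(self, text, voice_id, settings, audio_format, synthesize):
        """Return cached audio, calling synthesize(text) only on a miss."""
        key = self.key_for(text, voice_id, settings, audio_format)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        audio = synthesize(text)
        self._store[key] = audio
        return audio
```

A production cache would also need size limits and expiry, which this sketch leaves out.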

Monitor Usage and Costs

ElevenLabs charges based on character count. Monitor your usage in the RunTheAgent dashboard and set up alerts when you approach your plan limits. Adjust voice triggers to optimize costs.
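
Because billing follows character count, a usage alert comes down to a running total compared against a threshold. The following sketch assumes a monthly character limit and an alert at 80 percent of it. Both numbers and the UsageTracker class are illustrative, not values from any specific plan.

```python
class UsageTracker:
    def __init__(self, monthly_limit, alert_ratio=0.8):
        self.monthly_limit = monthly_limit
        self.alert_ratio = alert_ratio
        self.used = 0
        self._alerted = False

    def record(self, text):
        """Add the text's characters to the total.

        Returns True exactly once, when usage first reaches the alert threshold.
        """
        self.used += len(text)
        crossed = self.used >= self.alert_ratio * self.monthly_limit
        if crossed and not self._alerted:
            self._alerted = True
            return True
        return False

    def remaining(self):
        """Characters left before the plan limit, never below zero."""
        return max(self.monthly_limit - self.used, 0)
```

Calling record with each text before it is sent for synthesis keeps the total current, and the one-time True return is the point at which to send a notification.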

Frequently Asked Questions

Ready to get started?

Deploy your own OpenClaw instance in under 60 seconds. No VPS, no Docker, no SSH. Just your personal AI assistant, ready to work.

Starting at $24.50/mo. Everything included. 3-day money-back guarantee.