ElevenLabs, a startup known for its AI voice cloning and text-to-speech APIs, has launched a platform for building conversational AI bots. Users can now create complete conversational agents on ElevenLabs’ developer platform, with customizable variables such as tone of voice and response length.
Key Features
- Customizable Agents: Users can select templates or create new projects, choose primary language, first message, and system prompt to define the agent’s persona.
- Large Language Models: Options include Gemini, GPT, or Claude, with adjustable response creativity (temperature) and token usage limits.
- Voice and Performance Tuning: Users can adjust voice, latency, stability, authentication criteria, and conversation length.
- Knowledge Base Integration: Users can add files, URLs, or text blocks as a knowledge base for the bot, and can also integrate a custom LLM.
- SDK and API: SDKs are available for Python, JavaScript, React, and Swift, along with a WebSocket API for further customization.
- Data Collection: Companies can define criteria to collect customer data and set evaluation criteria for call success.
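To make the settings listed above concrete, here is a minimal sketch of how such an agent configuration could be modeled and sanity-checked. Every name in it (AgentConfig, llm_family, max_tokens, the 0.0 to 1.0 temperature range) is a hypothetical chosen for illustration; it is not the ElevenLabs SDK or its actual schema.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical names throughout: this is NOT the ElevenLabs SDK, only an
# illustration of the kinds of settings described in the feature list.
SUPPORTED_LLM_FAMILIES = ("gemini", "gpt", "claude")


@dataclass
class AgentConfig:
    language: str                 # primary language, e.g. "en"
    first_message: str            # what the agent says first
    system_prompt: str            # defines the agent's persona
    llm_family: str               # one of SUPPORTED_LLM_FAMILIES
    temperature: float = 0.5      # response creativity (assumed 0.0-1.0 range)
    max_tokens: int = 512         # assumed token usage limit
    knowledge_base: List[str] = field(default_factory=list)  # files, URLs, text

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        problems = []
        if self.llm_family not in SUPPORTED_LLM_FAMILIES:
            problems.append(f"unsupported LLM family: {self.llm_family}")
        if not 0.0 <= self.temperature <= 1.0:
            problems.append("temperature must be between 0.0 and 1.0")
        if self.max_tokens <= 0:
            problems.append("max_tokens must be positive")
        if not self.system_prompt.strip():
            problems.append("system_prompt must not be empty")
        return problems


# Usage: build a config and check it before sending it anywhere.
config = AgentConfig(
    language="en",
    first_message="Hi, how can I help you today?",
    system_prompt="You are a friendly support agent.",
    llm_family="claude",
    knowledge_base=["https://example.com/faq"],
)
print(config.validate())
```

Validating the configuration locally, before any network call, is a design choice in this sketch rather than something the platform is known to require.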
Development and Competition
ElevenLabs is building the new product on its existing text-to-speech pipeline and is developing speech-to-text capabilities for it. The company does not yet offer a stand-alone speech-to-text API, but it may do so in the future, which would put it in competition with Google, Microsoft, Amazon, OpenAI’s Whisper, AssemblyAI, Deepgram, Speechmatics, and Gladia.
Market Position
ElevenLabs, which is aiming for a valuation above $3 billion, competes with voice AI startups such as Vapi and Retell and, notably, with OpenAI’s real-time conversational API. The company believes its customization options and ability to switch between models will give it a competitive edge.