Workshop Overview
Voice is a powerful creative medium, and AI tools have made voice production accessible to far more people. This workshop covers AI voice synthesis from the ground up, from a first text-to-speech call to a working narration tool.
Topics Covered
- Text-to-speech with ElevenLabs: choosing voices, adjusting stability and clarity
- Voice cloning: creating custom voice profiles from audio samples
- Open-source alternatives: Coqui TTS, Bark, and Tortoise TTS
- Building a narration tool: combining LLM-generated scripts with voice synthesis
- Adding emotion and expression control
- Ethical considerations: consent, disclosure, and deepfake prevention
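To give a feel for the first topic, here is a minimal sketch of a text-to-speech request with the two voice settings mentioned above. It assumes the ElevenLabs REST endpoint `POST /v1/text-to-speech/{voice_id}` with an `xi-api-key` header, and that the "clarity" slider corresponds to the `similarity_boost` field in the API. Check both against the current ElevenLabs documentation before relying on them. The function name `build_tts_request` and the default values are illustrative only.

```python
import json
import urllib.request

# Assumed base URL for the ElevenLabs REST API.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(text, voice_id, api_key, stability=0.5, similarity_boost=0.75):
    """Build (but do not send) a text-to-speech request.

    stability and similarity_boost are assumed to be floats in [0, 1];
    the defaults here are placeholders, not recommended values.
    """
    for name, value in (("stability", stability), ("similarity_boost", similarity_boost)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    payload = {
        "text": text,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` should return the audio bytes, which can then be written to a file. Building the request separately from sending it makes it easy to inspect the payload without spending any characters from your quota.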
What You'll Build
By the end of this workshop, you'll have a working prototype that takes a text prompt, generates a script with an LLM, and reads it aloud with a custom AI voice.
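The shape of that prototype can be sketched as a two-stage pipeline. This is only an outline under stated assumptions: `narrate`, `generate_script`, and `synthesize` are hypothetical names, and the two stages are passed in as plain functions so that any LLM client or voice engine can be plugged in.

```python
from typing import Callable, Tuple


def narrate(
    prompt: str,
    generate_script: Callable[[str], str],   # hypothetical: wraps your LLM call
    synthesize: Callable[[str], bytes],      # hypothetical: wraps your TTS call
) -> Tuple[str, bytes]:
    """Turn a text prompt into a script and then into audio bytes."""
    script = generate_script(prompt).strip()
    if not script:
        # Fail early rather than sending an empty string to the voice engine.
        raise ValueError("the script generator returned no text")
    audio = synthesize(script)
    return script, audio


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without any API keys.
    fake_llm = lambda prompt: f"Here is a short story about {prompt}."
    fake_tts = lambda text: text.encode("utf-8")

    script, audio = narrate("a lighthouse", fake_llm, fake_tts)
    print(script)
    print(len(audio), "bytes of placeholder audio")
```

Keeping the stages as injected functions means the stand-ins can later be swapped for real LLM and voice synthesis calls without touching `narrate` itself.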
Prerequisites
- Basic JavaScript or Python knowledge
- An ElevenLabs API key (free tier gives you 10,000 characters/month)
- A microphone if you want to try voice cloning with your own voice
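A quick way to see how far the free-tier allowance goes is to count characters before synthesizing anything. The sketch below assumes every character of input text counts once against the 10,000-character monthly figure quoted above; actual billing rules may differ, and `remaining_after` is a hypothetical helper name.

```python
# Monthly free-tier figure quoted in the prerequisites.
FREE_TIER_CHARS = 10_000


def remaining_after(scripts, quota=FREE_TIER_CHARS):
    """Return the characters left after synthesizing every script once.

    Assumes one character of input costs one character of quota.
    A negative result means the scripts exceed the quota.
    """
    used = sum(len(script) for script in scripts)
    return quota - used


if __name__ == "__main__":
    drafts = ["x" * 2_500, "y" * 500]  # placeholder scripts
    print(remaining_after(drafts))
```

Under this assumption, a 2,500-character script can be synthesized four times in a month, so it is worth iterating on the text before sending it to the voice engine.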