How it works
Story Narration
Overview
Story narration demonstrates how AI can turn a PDF of your favorite book into a multi-character audiobook, with a distinct voice for each character.
Business Applications
Integrating an LLM with a text-to-speech (TTS) model enables speech-aware chatbots and embodied AI hardware, such as automated service terminals.
High-Level Technical Workflow
1
Character Mapping & Dialogue Formatting
Given the PDF/book text, the LLM maps each of the story's characters to the most suitable voice from a predefined list of voices with distinct personalities. It then generates a dialogue-like version of the text.
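As a rough illustration of this step, the sketch below builds a prompt from a voice list and parses a JSON reply into a character map and a list of dialogue turns. The voice names, the prompt wording, the JSON reply shape, and the helper names `build_mapping_prompt` and `parse_llm_reply` are all assumptions made for the example; the actual demo may structure this differently, and the LLM call itself is left out.

```python
import json

# Hypothetical voice catalogue; names and descriptions are illustrative only.
VOICES = {
    "narrator_calm": "neutral, measured storyteller",
    "elder_warm": "older, warm and gentle",
    "youth_bright": "young, energetic",
}


def build_mapping_prompt(book_text, voices):
    """Build a prompt asking the LLM for a character map and a dialogue script."""
    voice_lines = "\n".join(f"- {name}: {desc}" for name, desc in voices.items())
    return (
        "Assign each character in the story to exactly one voice from this list:\n"
        f"{voice_lines}\n"
        'Reply with JSON of the form {"character_map": {character: voice}, '
        '"dialogue": [{"speaker": ..., "text": ...}]}.\n\n'
        f"Story:\n{book_text}"
    )


def parse_llm_reply(reply, voices, default_voice="narrator_calm"):
    """Validate the (assumed) JSON reply and attach a voice to every turn."""
    data = json.loads(reply)
    # Fall back to the default voice if the model names a voice not in the list.
    character_map = {
        character: voice if voice in voices else default_voice
        for character, voice in data["character_map"].items()
    }
    dialogue = [
        {
            "speaker": turn["speaker"],
            "text": turn["text"],
            "voice": character_map.get(turn["speaker"], default_voice),
        }
        for turn in data["dialogue"]
    ]
    return character_map, dialogue
```

Validating the reply against the predefined list keeps an unexpected voice name from reaching the TTS stage; unknown speakers and voices fall back to the narrator voice here, which is one possible policy among several.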
2
Generating Audio Segments
The dialogue is passed in chunks to the TTS model, resulting in multiple WAV files.
The resulting audio segments are concatenated into a single file. The audio is cleaned if needed and delivered to the user with a live transcription and a character map.
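The chunk-and-concatenate flow can be sketched with Python's standard `wave` module. This is a minimal example under stated assumptions: `fake_tts` is a stand-in that emits silence instead of calling a real TTS model, the 200-character chunk limit is arbitrary, and `chunk_dialogue` and `concat_wavs` are hypothetical helper names. Audio cleanup and live transcription are not covered.

```python
import io
import wave


def chunk_dialogue(turns, max_chars=200):
    """Group consecutive turns into chunks of at most max_chars of text.

    A single turn longer than max_chars still gets its own chunk.
    """
    chunks, current, size = [], [], 0
    for turn in turns:
        length = len(turn["text"])
        if current and size + length > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(turn)
        size += length
    if current:
        chunks.append(current)
    return chunks


def fake_tts(text, sample_rate=16000):
    """Placeholder for a TTS call: WAV bytes of silence, 10 ms per character."""
    n_frames = len(text) * sample_rate // 100
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)      # mono
        writer.setsampwidth(2)      # 16-bit samples
        writer.setframerate(sample_rate)
        writer.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def concat_wavs(wav_blobs):
    """Concatenate WAV byte strings that share channels, sample width, and rate."""
    if not wav_blobs:
        raise ValueError("no audio segments to concatenate")
    out = io.BytesIO()
    expected = None
    with wave.open(out, "wb") as writer:
        for blob in wav_blobs:
            with wave.open(io.BytesIO(blob), "rb") as reader:
                fmt = (reader.getnchannels(), reader.getsampwidth(), reader.getframerate())
                if expected is None:
                    expected = fmt
                    writer.setnchannels(fmt[0])
                    writer.setsampwidth(fmt[1])
                    writer.setframerate(fmt[2])
                elif fmt != expected:
                    raise ValueError("segments have mismatched audio formats")
                writer.writeframes(reader.readframes(reader.getnframes()))
    return out.getvalue()
```

A typical use would synthesize one WAV per chunk, for example `concat_wavs([fake_tts(" ".join(t["text"] for t in c)) for c in chunk_dialogue(turns)])`. Checking the format of every segment before appending avoids silently producing a file that plays at the wrong speed when segments differ in sample rate.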