Chapter 8: How AI Creates Voice, Music, and Video
The hum of the office was punctuated by a strange sound — a deep, echoing voice coming from Derek’s laptop. Ivy paused, raising an eyebrow.
“Is that Morgan Freeman?” she asked.
Derek grinned. “Not quite. That’s my own voice, or at least what an AI thinks I would sound like if I were narrating a documentary.”
Ivy leaned closer, astonished. “That’s insane! It sounds so real.”
“Welcome to the world of AI-generated voice,” Derek said. “This week, we’re diving into audio, music, and video generation, the side of AI that’s leaving creatives either very excited or slightly terrified.”
They sat at the demo station, where Derek pulled up a few tools. He started with a text-to-speech platform. He typed a sentence, “Ivy is the smartest intern we’ve ever had,” and selected a custom voice model. The system generated the audio in seconds.
Ivy listened to her “AI boss” praise her. She laughed, but also looked thoughtful. “So… this can say anything, in anyone’s voice?”
“Exactly,” Derek said. “That’s where deepfakes and voice cloning become not just fascinating, but dangerous.”
He showed her a side-by-side comparison: on one side, a famous actor giving a speech; on the other, a deepfake that was nearly indistinguishable from the real footage. Then another clip: a CEO appearing to make a false statement. “This tech can be used for fun… or fraud,” he said seriously.
To balance things out, Derek turned to music generation. With a few prompts, they created a lo-fi track with piano and rain ambience. Ivy adjusted the tempo and mood. “Feels like I’m directing a tiny orchestra,” she said.
“Exactly. And that leads us to your challenge,” Derek said, handing her a storyboard template. “You’re going to be a mini-director. Choose a character and emotion. Use AI to generate a short voiceover and background music. You’ll create a 30-second video using stock clips and narration.”
Ivy's eyes sparkled. “So I get to produce and direct?”
“You get to experiment. But also, think about it,” Derek said. “What happens when people can’t tell real from fake? Who’s responsible for misinformation when AI gets too good at acting human?”
Later that evening, Ivy uploaded her mini film to the internal Slack channel — a dreamy clip of a girl walking in the rain, with AI-generated music and a soft-spoken narrator reading a poem Ivy wrote. The response from the team was immediate: fire emojis, applause, and comments like “Emotional masterpiece!”
But a shadow lingered in Ivy’s mind too: the realization that this beautiful tool could just as easily be used to deceive. She closed her laptop with a new question: In a world where reality can be generated, what’s the role of truth?