🎄Use AI voice on your tutorial videos🌟
If for whatever reason you don't like your voice, or you want to make videos programmatically without recording yourself, you can use an AI voice to generate the voiceover for your videos.
Voice APIs 🎤
There are multiple voice APIs available.
Most of them have similar pricing, and some have free tiers. The APIs and UX are also similar: you send text and get an audio file back.
They compete on quality (how natural the voice sounds), supported languages, and speed.
Example
Let's look at an example of how to use it with the Azure API.
In Python, Azure has an API client:
import azure.cognitiveservices.speech as speechsdk
The simplest way to use it is to create a synthesizer that writes to a file and call its speak_text_async method:
speech_config = speechsdk.SpeechConfig(subscription="your-subscription-key", region="your-region")
audio_config = speechsdk.audio.AudioOutputConfig(filename="file.wav")
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
# speak_text_async returns a future; .get() waits until the audio has been written
result = synthesizer.speak_text_async("Hello world").get()
Aligning video and audio
The problem with the above approach is that you get a single audio file, but you need to align it with your video.
There are two ways to solve it:
- Split your script into multiple files and get multiple audio files back.
Then place each audio file at the correct time.
- Use bookmarks. Add bookmark markers to your script, and the text-to-speech software will give you timecodes for them.
Then you can place videos or files at the correct timecodes.
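Here is a rough sketch of the bookmark approach with the Azure client. The helper names (build_ssml, ticks_to_seconds, synthesize_with_bookmarks) and the voice name are my own picks for illustration. It assumes the SDK's bookmark_reached event reports the bookmark name as evt.text and the position as evt.audio_offset in 100-nanosecond ticks, so double-check that against the SDK version you have installed.

```python
from xml.sax.saxutils import escape, quoteattr

TICKS_PER_SECOND = 10_000_000  # assumption: offsets arrive in 100-nanosecond ticks


def build_ssml(segments, voice="en-US-JennyNeural", lang="en-US"):
    """Join (bookmark_name, text) pairs into one SSML document.

    A <bookmark/> element is placed right before each segment's text.
    The default voice name is only an example; use any voice you have access to.
    """
    body = " ".join(
        f"<bookmark mark={quoteattr(name)}/>{escape(text)}"
        for name, text in segments
    )
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )


def ticks_to_seconds(ticks):
    """Convert an audio offset in ticks to seconds."""
    return ticks / TICKS_PER_SECOND


def synthesize_with_bookmarks(ssml, out_path, key, region):
    """Write the audio to out_path and return {bookmark_name: seconds}."""
    # Imported here so the helpers above work without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioOutputConfig(filename=out_path)
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=audio_config
    )

    timecodes = {}

    def on_bookmark(evt):
        # Called by the SDK each time a bookmark is reached during synthesis.
        timecodes[evt.text] = ticks_to_seconds(evt.audio_offset)

    synthesizer.bookmark_reached.connect(on_bookmark)
    result = synthesizer.speak_ssml_async(ssml).get()
    if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
        raise RuntimeError(f"Synthesis did not complete: {result.reason}")
    return timecodes
```

You would call it along the lines of synthesize_with_bookmarks(build_ssml([("intro", "Hello world"), ("step1", "Open the editor")]), "file.wav", "your-subscription-key", "your-region") and use the returned timecodes to decide where each video clip starts.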
How to stitch audio and video together
There are multiple ways to do it, but if you want to do it programmatically, the easiest is to use ffmpeg:
ffmpeg -i video.mp4 -i audio.wav -c:v copy -c:a aac -map 0:v:0 -map 1:a:0 output.mp4
Or, on a Mac, you can use iMovie if you prefer drag and drop.
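If you went the ffmpeg route and have several audio clips that each need to start at a specific time, you can build the command from Python. This is only a sketch: build_ffmpeg_command and stitch are hypothetical helper names, and it assumes ffmpeg is on your PATH. It delays each clip with the adelay filter and mixes them with amix; the normalize=0 option needs a reasonably recent ffmpeg build, so drop it if yours complains.

```python
import subprocess


def build_ffmpeg_command(video_path, clips, output_path):
    """Build an ffmpeg command that lays audio clips over a video.

    clips is a list of (audio_path, start_seconds) pairs.
    """
    if not clips:
        raise ValueError("at least one audio clip is required")

    cmd = ["ffmpeg", "-y", "-i", video_path]  # -y overwrites the output file
    filters = []
    labels = []
    for index, (audio_path, start_seconds) in enumerate(clips):
        cmd += ["-i", audio_path]
        delay_ms = round(start_seconds * 1000)
        # Input 0 is the video, so the audio clips start at input 1.
        # The delay is given twice to cover both channels of a stereo clip.
        filters.append(f"[{index + 1}:a]adelay={delay_ms}|{delay_ms}[a{index}]")
        labels.append(f"[a{index}]")

    # Mix all delayed clips into a single audio track without rescaling volume.
    mixed_inputs = "".join(labels)
    filters.append(f"{mixed_inputs}amix=inputs={len(clips)}:normalize=0[aout]")

    cmd += [
        "-filter_complex", ";".join(filters),
        "-map", "0:v:0", "-map", "[aout]",
        "-c:v", "copy", "-c:a", "aac",
        output_path,
    ]
    return cmd


def stitch(video_path, clips, output_path):
    """Run ffmpeg and raise if it exits with an error."""
    subprocess.run(build_ffmpeg_command(video_path, clips, output_path), check=True)
```

For example, stitch("video.mp4", [("intro.wav", 0), ("step1.wav", 2.5)], "output.mp4") would start the second clip 2.5 seconds into the video. The start times can come from your own notes or from the timecodes your text-to-speech API reports.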