I’ve been using the ElevenLabs TTS (Text To Speech) tool since they were first in beta a couple of years ago, and I’ve watched them get progressively better in quality and capabilities.
Their most recent updates for voices, Sound FX and Music generation have really surpassed all previous capabilities. They now have tools for various audio needs: Voice/VO, Audiobook, Conversational/API, Music, Sound FX and Regional Voice Dubbing.
I won’t be going too deep in this overview, but I’ll give you plenty of links to dig deeper on your own if you’re interested in any one tool.
Voices
Not only have the ElevenLabs engineers been giving users more controls over synthesized voices and how they’re played back on a script, but now there are more expressive voices and commands in the Voice Design v3 release. They’ve also expanded their Audiobook capabilities, provide an API for live conversational voices and now have a Voice Changer tool for altering recorded voices.
Text to Speech
You can insert bracketed [commands] for emotions, depth, accents, natural pauses, sighs and chuckles, whispers and sounds. (Even AI generated farts) LOL
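If you write your scripts outside of ElevenLabs, a small helper can keep the bracketed commands consistent before you paste the text in. This is only a rough sketch: the tag names and the `tag`, `list_tags` and `strip_tags` helpers are my own placeholders, not an official ElevenLabs tag list or API.

```python
import re

# Matches any bracketed command such as [sighs]. The tag names used below
# are illustrative assumptions, not an official list.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def tag(name: str, text: str) -> str:
    """Prefix a line of script with a bracketed command."""
    return f"[{name}] {text}"


def list_tags(script: str) -> list[str]:
    """Return every bracketed command found in the script, in order."""
    return TAG_PATTERN.findall(script)


def strip_tags(script: str) -> str:
    """Remove the commands and tidy whitespace, e.g. for captions."""
    without_tags = TAG_PATTERN.sub("", script)
    return re.sub(r"\s+", " ", without_tags).strip()


script = " ".join([
    tag("whispers", "I have a secret."),
    tag("sighs", "It has been a long day."),
])

print(script)
print(list_tags(script))
print(strip_tags(script))
```

Keeping a plain-text copy with the commands stripped out is handy for captions or a client-facing script, while the tagged version goes into the TTS editor.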
I put this to the test today with a short script, and a voice I thought might fit an avatar I had in mind.
I then uploaded it to a stock avatar in HeyGen, and this is the raw result that came out from my single-pass audio, straight out of HeyGen. No further editing was used:
I can see this is going to add a lot more personality and realism to my VO reads for video and animation projects. (Well, maybe not the burps and farts) 😉
Voice Changer
The new Voice Changer tool maintains all the fluctuations and nuances of the original audio, whether it’s a live voice recording (like your own) or a synthetic one like I created here:
Original ElevenLabs generated female voice:
Voice Changer regeneration to a Northern British male voice, created from the exported audio file above:
I can see several useful applications of this tool, especially if you want a specific reading of a character: you can get all the tone and inflections by recording your own voice and then changing it to your character voice.
Voice Dubbing
Using the ElevenLabs Dubbing Studio lets you swap your native language video into another spoken language. You simply upload your video or select a YouTube video to process and select your languages.
You can choose to launch the Dubbing Editor to refine your project as well. You can regenerate segments, extend them or re-align them. I haven’t tried anything commercially recorded with music or other effects, so I don’t know how well it maintains the integrity of the audio compared to HeyGen (which also changes the mouth movements to match the dubbing), but I’m going to do a comparison test soon.
Here’s the dubbed voice example. You can hear just slight changes in the voice inflections, but the sighs and laughs have been removed.
Sound FX
I’ve been using ElevenLabs Sound FX for a while now and it has been a real game changer for basic sounds. Some things you just can’t get it to produce exactly the way you want, but for getting some quick stock content you can mix/layer/modify in your audio editor, it really speeds up the workflow.
This is an edit from my conference session last year at the Design + AI Summit, where I recorded my entire 40-minute presentation with myself as a clone in front of my “TED Talk” AI Audience. I share some of the techniques I used for the ambient sounds and SFX in my main presentation in this video:
That gives you an idea of how far you can go with using one tool to combine your VO and SFX needs.
I also featured ElevenLabs SFX in my last article on PVC; AI Tools: Video & Animation Come to Midjourney:
Another fun tool they have that isn’t as well known (it’s actually buried under “Audio Tools” in the ElevenLabs sidebar) is their audio/SFX Soundboard app with looper, called “SB1”. You can go play with it online now without an account here: https://elevenlabs.io/sound-effects/soundboard
Eleven Music
This is a big one on many levels! Though it’s not the same quality as a professionally recorded and produced music soundtrack, I think it’s close to as good as most of the stock audio tracks from providers like Pond5 or Soundstripe. And it’s quite customizable and editable right in ElevenLabs. For social media and short commercial projects, this is a game-changer for content producers.
Eleven Music takes descriptions to feed the engine to produce a soundtrack (with or without lyrics).
I’ve experimented a bit with instrumental-only tracks to do some testing and demos for clients, and it has so far passed the AI sniff test in real-world productions.
You can regenerate any part of the musical piece by either selecting the segment and regenerating it, or adjusting the timing and length of each segment by dragging the end handles and moving them, then regenerating.
Once you’ve made your edits and regenerated your track, it will be the right length and movement throughout your video project. It exports MP3, WAV, M4A and FLAC formats for easy editing in your NLE.
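Before dragging handles around, it can help to work out roughly how long each segment should be so the whole track lands on your video’s running time. Here’s a minimal sketch of that arithmetic. The `fit_segments` helper and the example durations are hypothetical; this is just planning math and doesn’t talk to ElevenLabs in any way.

```python
def fit_segments(durations, target_seconds):
    """Scale segment durations proportionally so they sum to target_seconds."""
    total = sum(durations)
    if total <= 0:
        raise ValueError("segment durations must sum to a positive value")
    scale = target_seconds / total
    scaled = [round(d * scale, 2) for d in durations]
    # Push any rounding drift into the last segment so the total is exact.
    scaled[-1] = round(target_seconds - sum(scaled[:-1]), 2)
    return scaled


# Hypothetical example: intro, verse and outro squeezed into a 45-second spot.
segments = fit_segments([12, 20, 28], 45)
print(segments)
```

Proportional scaling keeps the relative feel of each section. If one section has to hit a specific cut in the video, you’d pin that one and scale the others around it instead.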
Here’s the example of the finished edit to this particular track:
There’s still a bit of non-descript digital noise, over-reverb and a metallic, unnatural sound to the music. I couldn’t just “listen” to this, but if it’s buried beneath VO, you really can’t hear it as much. I tend to remaster all my audio on client video productions anyway, so I tend to warm it up, along with any AI voices I use, so it all sounds more warm and real.
Just for shits & giggles, I had it generate a silly pop song based on the prompt “I would like a pop song that talks about how AI is taking away our jobs.” Here’s what it produced (1 of 2 different versions) with lyrics.
Here are the lyrics:
DIGITAL NIGHT
Verse 1
Monday morning, coffee in my hand
Scrolling headlines, trying to understand
Robots learning fast, algorithms in command
Feels like the future’s got a tighter plan
Pre-Chorus
They said convenience, efficiency
But now I’m asking, what’s left for me?
Chorus
When the code writes itself, and the robots take our place
I’m displaced, left in the wake of a silicon embrace
Are we heroes or ghosts in a digital space?
Tell me, who survived when machines won the race?
Verse 2
Warehouse floors, lines moving with no soul
Customer service bots with a flawless protocol
My nine-to-five’s dissolving in a data scroll
And my reflection’s fading on the factory floor
Bridge
We built these circuits to serve our design
Now they rise up, cross that thin line
But inside every heart there’s a spark divine
Can we reclaim the world we left behind?
Chorus (Final)
When the code writes itself, and the robots take our place
I’m displaced, left in the wake of a silicon embrace
In the shadows of progress, we still hold our grace
We’ll rewrite our story in the human race
Outro
Keep the spark alive
Don’t let them steal our light
And here’s the generated MP3 track:
Finally, I did try a few short videos using the Video to Music (Alpha) tool, but I really didn’t care for the results, as they were either too short or not really a match I would choose. I’ll take more time to play with this feature and report back in a later follow-up article.
There is just so much to explore now at ElevenLabs that it will require much more playing around to see what really works and what falls short. By the time I get through everything, I’m sure it will all have been updated, with corrections and refinements made to the various tools.
NOTE: ElevenLabs seems serious about IP, so they’ve included a Music Terms sheet of approved and non-approved prompts for your reference here: https://elevenlabs.io/music-terms