The subject of AI reading audiobooks has come up a number of times, and today I bring you an interesting article from Brian’s Book Blog: Audible’s Virtual Voice is Flooding the Market.
Brian talks in depth about the potential harm and ramifications of relying on the Virtual Voice to read audiobooks and he backed it up with some interesting numbers. We support his position, with the caveat that Virtual Voice may become a necessary tool for visually impaired readers who otherwise wouldn’t have access to the written text.
So what does Virtual Voice sound like? Here is an example. Gordon had a very strong reaction to it. It was a no right away. I listened to about a minute before the weird flatness got me. It just doesn’t have the warmth and range of a human voice.
We will not be using Virtual Voice in the production of our audiobooks. With that in mind, I bring you the voice of Roman. This is a cold audition, with Chris reading excerpts with no context. Before everyone gets upset, we know he mispronounced a couple of things, specifically Shinar. I forgot to give him the audio clip for it. No worries, all that will be fixed in the final recording.
The unfortunate reality of all of this is that AI is here to stay. As Mod R put it, “It can feel like holding a finger in the dyke against the deluge: ultimately the indie market will choose the AI covers, and AI voices, and writers will use ChatGPT to edit and proofread, because people want it fast and free.”
In the commercial space, fast and free usually wins. Midjourney is already capable of generating images which most people without an artistic background would find captivating. It can make a cover in seconds. No need to pay the artist, especially if you are just slapping together an object cover. Adobe Photoshop now comes with an AI generator built in. Check this out:
One of the authors I’ve known for years just put out an indie book with an AI cover. I do feel that she probably paid someone to make it, but it’s still an AI cover. I’ve read some 70 comments on her Facebook post. Nobody called her on it. People thought the cover was pretty. Whoever made it didn’t even bother to file off the proverbial serial numbers – it has Midjourney standard face and weird lighting stamped all over it.
Right now someone is probably reading a Chat GPT novel. And who knows, it might do the job. Although that might be a bit off. I’ve asked Chat GPT to continue our writing sample.
Our text:
Please write 2 paragraphs continuing the following scene:
Snow crunched under his feet. It spread in front of him like a glittering blanket, a foot deep, sheathing the vast plain he was crossing, and he sank a little with every step. Above, a night sky gaped like a hole in existence, a spray of stars floating in its black depths.
He didn’t know how long he had been walking. It felt like forever. He didn’t know his destination either. He only felt it, pulling him like a magnet toward the dark wall of colossal fir trees at the edge of the plain.
Step. Another step.
Bitter cold bit at his face. His nose had gone numb, and he could barely feel his fingers in thick red mittens as he clutched the rope pulled tight over his shoulder.
He was holding a rope. Why?
It felt strange somehow.
He stopped and looked over his shoulder. An enormous fir tree lay on the snow behind him. The rope was wrapped around its trunk. Behind it, a long trail of rough snow marked his wake and rolled off into the horizon. He had dragged the tree for miles.
Chat GPT’s response:
The man stood in the silent expanse, his breath forming ethereal clouds in the frigid air. The only sound was the muffled crunch of snow beneath his boots, a rhythmic percussion that matched the beat of his weary heart. The forest loomed ahead, the towering firs casting elongated shadows in the moonlight. A sense of foreboding gripped him, yet the magnetic pull toward the dark woods intensified. The stars above seemed to conspire with the trees, whispering secrets that only the night could understand.
With a hesitant resolve, he resumed his journey, the rope taut in his gloved hands. As he approached the forest’s edge, the trees seemed to close in on him, their branches forming a natural gateway into the unknown. The air grew thicker, laden with the scent of pine and mystery. He hesitated for a moment, peering into the darkness beneath the canopy. A subtle rustling echoed from within, a symphony of unseen creatures awakening in the night. The man took a deep breath, summoning the courage to step into the shadowy embrace of the towering firs.
The forest swallowed him whole, the snow-covered ground muffling his footsteps. The colossal trees stood like sentinels, their limbs entwined in a dance of ancient secrets. As he ventured deeper, the man became acutely aware of a presence, an intangible force that seemed to guide him through the maze of trunks and shadows. His senses heightened, attuned to the subtle whispers of the forest. The rope in his hands now felt like a lifeline connecting him to the unknown, unraveling a tale written in the language of nature itself. Each step brought him further into the heart of the enigma, and he couldn’t shake the feeling that the answers to his questions lay hidden among the ancient firs.
It is both flowery and overwritten, and oddly sappy somehow. A word salad. Chat GPT apparently quite loves the sound of its own prose, heh. Don’t we all.
The supporters of AI often accuse their opponents of being Luddites and trying to hold back progress. And in some sense, they are right. The original Luddites protested automation that led to the production of inferior products. We already have a proliferation of speed-written books out there. If Chat GPT becomes widely used, KU will be flooded with its offerings.
Why would any human bother to read something another human didn’t bother to write? Or to paint or to record?
Where does it leave all of us working in a creative field? I don’t know. We will see.
Carrie B says
1. The virtual voice sample from Angel’s Flight: It’s actually not bad. It’s much better than Text-To-Speech, and honestly I have heard worse human narrators. (Check the sample for Edit Mod R )
This will be invaluable for authors who cannot afford the up-front $thousands$ of hiring a narrator outright, and who don’t yet have the sales history to convince a narrator to invest the time/work to do a revenue split.
2. WOOT for Chris’ sample! I’m looking forward to hearing him do Roman’s angry muttering. This is the same Chris that does Simon, yes? Is it Christopher Crutchfield Walker of “The Wire” and “House of Cards” fame? ( https://www.imdb.com/video/vi3609049625/?ref_=nmvi_vi_imdb_3 )
3. Is Roman meant to be pronounced roe-Mahn, as Chris does it? (I remember when the publisher had Renee pronounce Simon as sai-Mahn, and House Andrews said on livestream it should just be plain Si-mon.)
4. I can’t see any difference in the Clean Sweep images, other than expanding and filling the canvas/background. What else is different?
Moderator R says
Hi Carrie,
Whilst I understand your stance and preferences, there is also the moral dimension of loss/devaluation of human vocal artist work and the virtual programs being trained on human data which wasn’t shared with consent. I have edited the shared link you provided, as we don’t want to direct criticism towards any particular narrator in the comments. Thank you for your understanding 🙂
The Roman narrator is Chris Brinkley - he has not voiced any characters for House Andrews before, and is not Christopher Crutchfield Walker, but is set to do wonderful angry muttering 😀
Roman is meant to be pronounced as in the audio sample, yes, it’s a Russian pronunciation https://www.youtube.com/watch?v=p6u9YaV4EQc . I think there’s also a bit of confusion there, as Saiman is not intended by the authors to be pronounced as “Simon”.
The difference is in the background image, but the AI went way above content fill.
Carrie B says
Hi ModR!
Sorry, I didn’t get an email note that you had replied – it must be something in my settings or filters…I fired up the new GA Book5 this morning and it hit me that I hadn’t seen a response about “which Chris is it” and came to look.
I totally understand and empathize on the current shake-up, dread, and possibility of work loss the narrators are facing due to AI. My eldest son wants to be a narrator…we’re staring down the Curran-sized, saber tooth-filled throat of this issue live and in-person daily here at the house.
I just worry about what all this very vocal anti-AI commentary says to the (disabled, elderly, poor, etc) people whose lives are being/soon will be significantly improved by AI.
When they hear/read what the AI boycotters here are saying, are they hearing “The issues you struggle to overcome on a daily basis that AI could solve/is solving are less important than my audiobook hobby”? Are they hearing “You don’t matter”?
It’s probably not a discussion to have down here in the comments, though.
Did I get confused about the Simon pronunciation? (It’s probable – I have ADHD.) I could swear they addressed this at one point during a Live, and Gordon always pronounces it just “Si-mon” when he’s talking, like the same as “Simon & Garfunkel” or “Simon Cowell”. Ah well.
Could you pass a request through to the powers that be at Graphic Audio asking them to post a *full* list crediting the actors and which characters they voice on each book’s webpage at GA?
Or if that would be too troublesome, maybe they could just send the master list to you to copy/paste it here on the blog, so the Horde could have somewhere to reference it?
They list off a *few* of the actor/character pairings at the end of each audiobook (and then just a simple list of the remaining actors), but there’s nowhere that we can hear/read the *full* list. Inquiring Horde minds want to know, lol!
Moderator R says
I have passed the request on and I think it will be accommodated by GA! 🙂
Carrie says
As has been stated previously by others, I think AI can be useful for people with disabilities and for things like textbooks, summarizing reviews (Amazon clearly labels their summary of product reviews as AI generated) and party helpers (brilliant of mom to use it in that way!).
At the very least, all AI products should be labeled as AI so you know what you’re getting.
Last week, I was tempted by a Facebook ad into downloading an app I had not heard of for audiobooks. The first couple of chapters were interesting but predictable. That’s fine, I like predictable sometimes. But the longer the book went on, the more holes showed up in the story. And the narration was definitely AI even though it wasn’t flat. (The same happy female voice that you sometimes hear on Facebook reels was the narrator.) She was perfectly cheerful even when the main character was crying over her cheating fiancé and his pregnant girlfriend. 😄 It sounded ridiculous since it did not match the tone of the story. I stopped listening at chapter 4 and deleted the app. Nowhere did I find any reference to AI, yet it had to be. The holes in the story and some references made about the FMC’s career demonstrated a clear lack of understanding that humans would have intuited.
Nanette says
Chat GPT. Vomitous. What scares me is how many people, sigh, can’t seem to tell good writing from bad…
Also thank you for the Elemental Vampire hint
Felicia says
Well, you can rest assured that the AI “reading” the book is not sentient.
Momcat says
The writing is flat. Ever try to eat unsalted, cold oatmeal? That’s AI
BSR says
Like most people, not all technical advances are good … not all of them are bad. I can see a place for AI in some industries more than others (definitely not writing and music). I remember when Kindle came out … I loved it from the start and to me it was a positive change. Packing and moving my library was a major undertaking.
I see some parallels to the days when outsourcing jobs started. There was all the hype of it being good for the “global economy” and similar nonsense. In the end, outsourcing hurt quite a number of workers regardless of how much it helped the global economy. I see very little benefit for most people, overall.
Another area would be audio books. Definitely a need for them in certain situations but they will never be a choice for me as long as I can read. I develop my own voices in my mind when I read and an audio book just messes that up for me. But this I think of as a good technology because it helps any number of people, as others have mentioned.
Finally, it comes down to not being a question of good or bad but more a question of “just because we can do something, does that mean we should do something”. In my mind, the obvious answer is no … but it’s a fine line … and that’s just my opinion.
The one thing several pointed out which is very pertinent to me … style of writing. I think all of us dedicated Ilona Andrews fans could recognize their books even without the author name being on it. AI’s ability to copycat such things is what scares me … and why I think it needs to be kept out of certain areas.
Kristen says
I hear you on preferring reading to audiobooks – I’m much the same – but my spouse prefers audio. He’s a bit picky with narrators but will listen to almost anything read by the ones he likes. Some of them I listen to by dint of being around him and the audiobooks that are more of a performance than just reading are the best, imo. I’ve heard one or two that had more than one person to accommodate all the voices. It’s probably as close as we’re likely to get these days to radio shows like they were before TV.
PS – I like the voice for Roman, but Evdokia sounded a little…shrill, I think? Maybe it’s an internal bias but I preferred Ilona’s voice for her. Perhaps it’s the cadence of the complaints to/about Roman 😉
BSR says
Oops … didn’t realize how much I was babbling until I saw it posted. Sorry about that.
Terri says
Just like the text to voice, the AI/Chat GPT generated paragraphs lack any emotion, human understanding, or sense of timing/rhythm. To me, that is what truly distinguishes it from real writing or voicework. Sure, people can read the garbage, or listen to the flat/emotionless voice, but if you want to have your mind, heart and soul engaged, it takes a real human to do that.
Yvonne says
I love the variability of Chris’s character voices, great choice!! And I could not agree more about the flat or overly embellished nature of … well, pretty much all AI-generated creative efforts. I imagine machines will be able to completely replace us eventually, but not until they’re fully sentient and able to experience the human condition 😉
Katie F says
My employer has an entire series of trainings read by an unknown type of virtual voice and a synthesized video of a “human just like me TM”
It is seriously creepy, with an uncanny valley effect that I can’t put into words. Allegedly this was done because people pay more attention to trainings read by a human, but I would argue that this is worse than asking me to review straight text on a powerpoint.
I end up spending all of my training staring in horrified fascination at the virtual simulacrum trying to catch it in the act of being inhuman. So far I’ve retained nothing, two more virtual trainings to go.
Debra says
The virtual voice started out just too slow, and then came the Chinese woman, a recent immigrant, who didn’t have an accent. And then I stopped listening. I am willing to pay for books, narrations, art, and music created by actual people. I can see the advantage of virtual voice for people who have impaired vision, especially when books voiced by real narrators aren’t available.
Donna says
In re: the AI voice – ‘I’m sorry, Dave, I can’t do that..’ springs immediately to mind. Not in a good way….
Susann says
Chat GPT is a program that takes existing texts and words which are stored in databases on computers and puts them together in a sentence. The words and texts were put there by humans. It’s like cutting letters from 1 million books to make one.
IndiraIsReading says
I am scared that I will inadvertently buy an AI book and not even notice. It feels like I now gotta vet everything by having others recommend books, and I got no resources to do that myself.
Lenore says
To begin, honestly, I don’t listen to audio books as the voices are always wrong. The virtual voice actually sounded better to me, if I had to listen to an audio book, precisely because it is flat and I know it’s fake so it doesn’t bother me so much.
Terrie C says
I think some of the offerings on Kindle Unlimited are already AI generated. It can write a scene, but the character, emotion and feeling are missing. That’s why I skim a lot, and then come back to IA: there is something missing, and I cannot connect to it.
Monica says
Using AI to create something you offer for sale is the height of plagiarism. If I ever find out an author is using AI to write his/her books, that author will then be blacklisted by me. AI is ok for one’s own personal enjoyment; using it to sell creative endeavors is just gross.
Kate says
I love Chris Brinkley!! He’s so talented. He’s read a lot of Penny Reid’s books, that’s where I first heard him. He’ll be perfect for Roman!! Can’t wait to hear it!
Lee M says
I adore well-acted/voiced audiobooks—but I’ve also relied on Apple’s “Spoken Content” screenreader capabilities to read ebooks and other digital content to me for ~5 years now (specifically, the “Karen” and “Tom” voices), so this trend is kind of an odd one for me to watch, since I’ve essentially had an automated voice narrate books not available as audiobooks to me for years.
Possibly most relevant—Even though more nuanced/advanced voices are available (the ‘enhanced’ and ‘premium’ ones), I still use the slightly stilted (compared to newer versions) original Karen voice. I’m not alone in this! Several of my peers who use screenreaders have an attachment to a particular voice or another that they’ve used for a long time.
And while I’ve always thought of this feature as an unbelievable boon for making content I couldn’t otherwise consume accessible, in the context of this post, I can see how it’s deeply worrying for the audiobook industry, because my experience is proof of how quickly you can get used to a flat voice reading. Though how much of that was “the alternative was nothing” I can’t say.
I do think audiobook listeners are in the minority of readers? And I don’t know about anyone else but I’ve definitely followed narrators I’ve really liked and bought books I wouldn’t have otherwise — just as I’ve not purchased or set aside ones with narrators whose voices grate. I think that idea of specific voices will be kind of interesting to see.
Tom C says
Oh, that’s _not_ how Shinar is supposed to be pronounced? I have been saying it that same way… Anyway, yes, that AI voice is bad, dull and bland. I can put up with TTS when I need it to read a book for me while I’m otherwise occupied. Actually I like the TTS voices better: less inflection maybe, but less uncanny valley too.
Nineran says
Fwiw before you feed the beast with an offering of your writing, please be aware that it may never forget (depending on the version of ChatGPT you used). And it may spit it out, word for word, with the right prompt. And unless you have a norobots setting going, a version of GenAI has likely ingested every word on this website. You probably have enough words/content to finetune an AI with. I bet if you finetuned an AI with your collected works, it could imitate you better, just like a law-trained AI can match an A- student but ChatGPT by itself is only a C+ law student. (Both of those are scary for various reasons).
I don’t always understand why we are in such a hurry to make an entity that could replace us. Or worse, someday getting good stories out of AI will be our skill, like playing the harp, and we’ll rejoice. And I really dislike how this simple litany of facts makes me sound like a conspiracy theorist.
Ilona says
Nineran, it scraped all of our books. It was illegally fed our entire catalog from a pirate site. That train has sailed.
Di says
I’ve encountered quite a few books that are clearly not entirely written by the given author. One book in particular supposedly written by a woman had portions no woman could write – it was clearly written by a man. Now I wonder if it was an AI. I think some of these authors have a story line but are in such a hurry they cut corners using AI, or ghost writers. After encountering several of these dismal novels thru Amazon recommendations I no longer trust them.
I would much rather reread great books like House Andrews writes!
Just to say I REALLY appreciate the time and effort you put in. Thank you!
Syrus_Orelio says
Yeah, I already knew AI voices were bad and not for me. In middle school I was having a hard time reading one of my assigned literacy books because it was sooooo boring, and when something is boring I have a hard time concentrating; my mind tries to go off somewhere else. It didn’t help that I had only been diagnosed with ADHD and was only getting treated for that. I pretty much had to learn how to cope with my autism myself, because at the time autism wasn’t well known. By the time it was becoming more familiar and I was diagnosed in high school, I’d already developed some good coping skills.
So my bff suggested using an AI reader app she used for such situations. Growing up, my dad would read to me in a wide variety of voices (I preferred him over mom, as she had a very limited number of voices), and he discovered audiobooks on tape (before Audible became a thing; because of the tapes, the whole family jumped on Audible when it came out), so listening to AI read was frustrating. At the time it was actually worse than the sample provided here, if you can believe it. It would have random pauses where they weren’t needed and longer continuation where it needed a pause. But the monotone, emotionless voice was the same. I always listen to samples of audiobooks before I buy, as there have been narrators I just couldn’t listen to for one reason or another: too high a voice, or a narrator that was emotionless, and while I could tell it was a person rather than AI, it was too close to sounding like AI. But that’s like a very low percent of narrators, and I won’t name names even if I could remember which. But I did pass on some good-looking audiobooks and just read the ebook or physical book instead of listening to the audiobook in a few cases.
Dr V says
I have a series that I love and have been reading for many years. Last year another book in the series was released and I got the audio version. It was an AI voice. Nothing about it being an AI voice was mentioned and it sounded similar to previous books but I couldn’t make it through the book. I haven’t gone back to that series since.
Olivia says
Blind woman weighing in! OK, so I’ve been blind since toddlerhood, which means I grew up with synthesized speech. It’s come quite a long way since the 90s, I must say. Be that as it may, when it comes to my audiobooks, I will always prefer a human’s voice.
Synthesized speech can have inflection and show a limited range of emotion, but in the end, it’s a computer program. Many text to speech programs have a dictionary where you can phonetically spell things out so it isn’t so jarring to hear. For example, I seriously wanted to slap VoiceOver when it went through that phase of saying “tornado” like tor-raw-do.
AI is great: it’s what allows me to write this comment even though I can’t see the screen. When it comes to creative fields, though, humans have no equal.
Lacey Pfeffer says
I am anti AI in a creative space. The only thing I really approve of technology being used for is medical advancements. I hope when it comes to books the public will want more than cheap, that they will want quality. I feel confident that a large part of the book loving community will not be willing to give good authors up. The BDH isn’t going anywhere.
Linda says
I can relate to Gordon so well… :(( I HATE and cringe at the AI voice so much, I cannot tell you….
So, a bit of background: in my country due to some unfortunate 50 or so years of a very well-known regime in the 20th century, we had a lot of TV channels (nature, history, whatever floats you) even up till last year dubbing their content in a language that is a remnant of that regime. Due to recent events in Europe, our country finally manned up and made a law – everything that’s originally non-national language needs to either stay in original language (99% of the time it is English) or be dubbed in our national language.
You can see where this is going, right.. right? Instead of using normal people for dubbing, the channels have purchased an AI that translates and voices automatically. And it is horrendous…
First of all, zero emotion. And, let me tell you, I cannot listen to nature channels in such a sociopath dub for more than a few mins… Second, in our language, we conjugate numbers as well to match what they’re describing (so, numbers that describe feminine and masculine/singular and plural nouns will differ in their endings) and that stupid AI cannot conjugate those accordingly (it just leaves number in what would be the basic form, so it is incredibly frustrating to listen to because it doesn’t sound logical), and I cannot even begin to tell you – the history channels are ruined now as well…
Carla says
I have listened to an AI-read novel to see what that was like, and it is weird and soulless.
From the example you provided so is the writing.
I am not at all a fan.
Toni says
So going forward I would like the pop up…I AM NOT A ROBOT
before I purchase an audio book🤯
Shay says
Imo, I am not interested in reading something that somebody didn’t even have the interest in writing themselves. As Miyazaki once said, “[AI] is an insult to life itself.”
AJ says
ChatGPT never met an adjective or adverb it didn’t love. On subreddits where people post travel itineraries for feedback you can spot the ones made by ChatGPT instantly just from the adjective abuse.
Nik says
I don’t think I would enjoy listening to an entire book narrated by Virtual Voice, though I can see why some people might find it useful. I will continue to buy audiobooks because to me a good narrator makes a big difference to the experience.
A series that I love recently had the final book released. Prior to the release I did a full reread of the series via the audiobooks. The narrator (Khristine Hvam) is amazing and really brings the characters, and especially their accents, to life with the voices that she uses. To such a degree that I actually had trouble starting the final book on Kindle because my internal reading voice did not compare.
SuperJD says
The narrator is everything with audiobooks. I can barely listen to some (despite the book itself being good) when the narrator isn’t on target. So…. I guess what I’m saying is nope on the AI audiobook.
sage says
I recently watched a YouTube channel in which the creator lamented not having any rom com movies. So she got ChatGPT to write 3 rom com scripts. They were so “bad” that when she and her friends got together for the first cold read, they rewrote the entire scripts.
Another YouTuber that I watch has been playing around with image generating programs for a while now. He decided to not do it anymore a couple of months ago. It generated an image that he could not decide if it was copyright infringement or not. So he said he did not want to cross that line for ethical reasons. Other YouTubers spoke of the legal reasons and ramifications. One spoke about the possibility of losing the income from a video that took a month to do, because someone else said it was their work, not his. He said it took three years to prove to YouTube officials that it was him standing in front of the camera, talking about the things around him and reacting to what was happening.
Sorry, that was a long rant
Liz Rohonczy says
RoMAN? Is the emPHAsis on the right sylLAble? lol Sorry but I’m just having a hard time with it. But he is your character and book!
Loving the story as always!!
Liz says
Reading the sample, I am certain I have read Chat GPT novels and it explains why I have found myself skimming or not finishing stuff – it reads just like that and it doesn’t sound natural.