Why isn't everyone talking about AI generated audiobooks?

posted 1 year ago

I just listened to this AI generated audiobook and if it didn’t say it was AI, I’d have thought it was human-made. It has different voices, dramatization, sound effects… The last I’d heard about this tech was a post saying Stephen Fry’s voice was stolen and replicated by AI. But since then, nothing, even though it’s clearly advanced incredibly fast. You’d expect more buzz for something that went from detectable as AI to indistinguishable from humans so quickly. How is it that no one is talking about AI generated audiobooks and their rapid improvement? This seems like a huge deal to me.

simple@lemm.ee

122 points

1 year ago

A lot of people just aren’t aware of how fast AI is moving. AI voices were pretty meh earlier this year. A lot of people working in the audiobook/voice acting scene have been talking about this though.

driving_crooner@lemmy.eco.br

41 points

1 year ago

I recommend everyone check out the YouTube channel “Two Minute Papers”, which has been doing videos about AI papers for the last 10 years or so, to see how quickly AI has progressed. Like 5 years ago those image-generating AIs produced things that looked like LSD-infused dreams, and now the images look almost perfect.

Magrath@lemmy.ca

7 points

1 year ago

I wish I could watch his videos but the way he talks is awful. It’s like some exaggerated evolution of YouTube talk.

Liempong_pagong@beehaw.org

2 points

1 year ago

It’s great to be alive!

mindbleach@sh.itjust.works

4 points

1 year ago

I’m only shocked that video isn’t better. Diffusion models work like denoising - so you’d figure all the wiggly nonsense between frames would be the first thing to filter out.

Turun@feddit.de

4 points

1 year ago

I expect the data size to be a problem. Stable Diffusion defaults to 512x512px, because it simply requires a lot of resources to generate an image. Even more so to train a model. Now do that times 30 to generate even one second of video. I think we need something that scales better.

I fully expect this to work decently in a few years though. No matter how hard the challenge is, AI is moving really fast.
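
To put rough numbers on the "times 30" point, here is a back-of-the-envelope sketch. It assumes 30 frames per second and that cost grows linearly with the number of pixels generated; both are simplifying assumptions for illustration, not measurements of any real model.

```python
# Back-of-the-envelope arithmetic only; not a benchmark of any real model.
FRAME_WIDTH = 512        # default resolution mentioned in the comment
FRAME_HEIGHT = 512
FRAMES_PER_SECOND = 30   # assumed frame rate (the "times 30")


def pixels_per_frame(width: int, height: int) -> int:
    """Number of pixels in a single generated image."""
    return width * height


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Pixels needed for one second of video, if every frame is generated
    independently (assumption: no sharing of work between frames)."""
    return pixels_per_frame(width, height) * fps


frame_pixels = pixels_per_frame(FRAME_WIDTH, FRAME_HEIGHT)
second_pixels = pixels_per_second(FRAME_WIDTH, FRAME_HEIGHT, FRAMES_PER_SECOND)

print(f"one image:        {frame_pixels:,} pixels")   # 262,144
print(f"one second video: {second_pixels:,} pixels")  # 7,864,320
```

So under these assumptions, one second of video is about 7.9 million pixels against roughly 0.26 million for a single image, before even asking the frames to be consistent with each other.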

driving_crooner@lemmy.eco.br

4 points

1 year ago

I give it a year, maybe two, until we get fully synthetic video that can’t be easily distinguished from reality. There are already some very good AIs that complete or replace backgrounds in videos and work really well, while completely synthetic videos still look like nightmares for now.

Bobo@lemm.ee

34 points

1 year ago

Deleted by creator

bionicjoey@lemmy.ca

10 points

1 year ago

If your phone is rendering TTS on the fly that’s probably going to be a drain on battery.

Bobo@lemm.ee

4 points

1 year ago

Deleted by creator

rustyriffs@lemmy.world

4 points

1 year ago

What’s TTS?

lud@lemm.ee

9 points

1 year ago

Text to speech.

milicent_bystandr@lemm.ee

20 points

1 year ago

That sounds pretty cool, though I’d be concerned it will suffer from the classic problem of current AI (…and humans, but that’s by the by) of confident incorrectness. Like an automatic transmission can miss meanings and types of context that a human will spot, programmatically generating speech can probably mess up punctuation and flow - even the way a human reader sometimes will get part way through a sentence and realise they need to start again for it to come out right.

That said, I can’t see it being a big problem for most works, just unfortunate here and there. For once it seems an AI application short on downsides! (Except for the usual economic ones for many people previously trained in the field.)

Asklemmy

!asklemmy@lemmy.ml
