Sounds like they’re desperate to convince people that copyright law shouldn’t apply to only them. Sorry, but that’s not going to work. License the content you’re making money from, or don’t use it.
Possibly. On the other hand, OpenAI’s market cap is bigger than the ten largest publishers combined - despite their whining, they can afford to license. It’s not OpenAI that will be prevented from getting training data - the biggest impact is that it might stop smaller competitors and prevent open-source models from emerging.
I couldn’t care less what their market cap is; it’s a scam. Ponzi schemes are incredibly valuable until they aren’t.
This BS is an obvious attempt to astroturf Lemmy for the benefit of a corporation, and anybody falling for it is an easy mark.
Lol, what. OpenAI shares aren’t available - there’d be no benefit to anyone trying to pump them.
OpenAI’s market cap is bigger than the ten largest publishers combined
Only until the AI bubble bursts, I expect.
Why do you think anything will “burst”? If anything, if licensing requirements for content make training expensive, it’s likely to make the biggest existing players far more valuable.
What human author hasn’t read and been inspired by existing copyrighted works?
It’s not even that uncommon for humans to accidentally copy them too closely later on.
I fully agree with you. I mean, even search engines are fully reliant on the ingest and storage of copyrighted material.
Of course, the elephant in the room is how we stop multi-billion-dollar companies from advancing the technology significantly enough to put artists, programmers, writers and the like out of business.
You can’t. The cat is out of the bag. The algorithms are well understood, and new papers on ways to improve output of far smaller models come out every day. It’s just a question of time before training competitive models will be doable for companies in a whole range of jurisdictions entirely unlikely to care.
Machines don’t have inspiration; they are not people. They do not make decisions based on artistic choice, aesthetic preference, or half-remembered moments. They are plagiarism machines trained on millions of protected works, designed for the explicit purpose of putting everyone who created what they copy out of work.
In a vacuum AI tools are as harmless and benign as you want them to be, but in reality they are disastrously harmful to the environment to train, and they are already ruining the livelihoods of human creators who actually make art.
Whenever I see them described as “plagiarism machines”, odds are about 99% that the person using the term has no idea how these models work. Like humans, they can overfit, but most of what they output will have far less in common with any individual work than the levels of imitation people engage in all the time without being accused of plagiarism.
As for the environmental effects, it’s a totally ridiculous claim - the GPUs used to train even the top-of-the-line ChatGPT models add up to a tiny rounding error of the power use of even middling online games, and training has only gotten more efficient since.
E.g. researchers at Oak Ridge National Labs published a paper in December after having trained a GPT-4-scale model with only 3k GPUs on the Frontier supercomputer using Megatron-DeepSpeed. 3k GPUs is about 8% of Frontier’s capacity, and while Frontier is currently the fastest, there are hundreds of supercomputers at that kind of scale publicly known about, and many more that are not. Never mind the many millions of GPUs not part of any supercomputer.
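For scale, the 8% figure roughly checks out against Frontier’s publicly reported configuration - around 9,408 nodes with 4 AMD MI250X GPUs each. Those counts are my assumption from public specs, not from the paper itself, but the back-of-the-envelope arithmetic looks like:

```python
# Rough sanity check of the "3k GPUs is about 8% of Frontier" claim.
# Node and GPU counts are assumed from public Frontier specs, not the paper.
frontier_nodes = 9408                        # assumed node count
gpus_per_node = 4                            # AMD MI250X per node (assumed)
total_gpus = frontier_nodes * gpus_per_node  # 37,632 GPUs total
fraction = 3000 / total_gpus                 # share used by the training run
print(f"{fraction:.1%}")  # → 8.0%
```

So even the figure for training a frontier-scale model is a single-digit percentage of one machine, which is the point being made about availability.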
good