Sarah Silverman Sues Maker Of ChatGPT For Copyright Infringement

-1 points

1 year ago

Just like when the record labels sued every music-sharing platform in the early days. Adapt. They’re all afraid of new things, but in the end nobody can stop it. Think, learn, work with it, not against it.

diskmaster23@lemmy.one

11 points

1 year ago

I think it’s valid. This isn’t about the tech, but the sources of your work.

Sagrotan@lemmy.world

1 point

1 year ago

Of course it’s valid. And the misuse of AI has to be fought. Nevertheless, we have to think differently in the face of something we cannot stop in the long run. You cannot create a powerful tool and only misuse it. I miscommunicated here and should’ve explained myself. I’ve got no excuses, except maybe one: I was sitting on the shitter and wanted to keep things short.

dep@lemmy.world

14 points

1 year ago

Feels like a publicity play

Max_Power@feddit.de

36 points

1 year ago

I like her and I get why creatives are panicking because of all the AI hype.

However:

In evidence for the suit against OpenAI, the plaintiffs claim ChatGPT violates copyright law by producing a “derivative” version of copyrighted work when prompted to summarize the source.

A summary is not copyright infringement. If anything is a case for fair use, it’s a summary.

The comic’s suit questions if AI models can function without training themselves on protected works.

A language model does not need to be trained on the text it is supposed to summarize. She clearly does not know what she is talking about.

IANAL though.

erogenouswarzone@lemmy.ml

-10 points

1 year ago

SS is such a tool. Does anybody remember the big anti-gay speech that launched her career in The Way of the Gun? She’ll do anything to get ahead.

Here’s the speech: https://www.youtube.com/watch?v=PAl5xGi7urQ

wick@lemmy.fmhy.ml

15 points

1 year ago

You hate her because of a part in a shitty movie?

erogenouswarzone@lemmy.ml

-7 points

1 year ago

Did I say hate? I said she’s a tool.

PipedLinkBot@feddit.rocks

6 points

1 year ago

Here is an alternative Piped link(s): https://piped.video/watch?v=PAl5xGi7urQ

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I’m open-source, check me out at GitHub.

another_kbin_addict@lemmy.world

1 point

1 year ago

Good piped bot

jmcs@discuss.tchncs.de

25 points

1 year ago

I guess they will get to analyze OpenAI’s dataset during discovery. I bet OpenAI didn’t have authorization to use even 1% of the content they used.

Jaded@lemmy.dbzer0.com

7 points

1 year ago

Things might change, but right now you simply don’t need anyone’s authorization.

Hopefully it doesn’t change: only a handful of companies have the data or the funds to buy it, so a change would kill any kind of open-source or low-priced endeavour.

Flaky@iusearchlinux.fyi

4 points

1 year ago

FWIW, Common Crawl - a free/open-source dataset of crawled internet pages - was used by OpenAI for GPT-2 and GPT-3 as well as EleutherAI’s GPT-NeoX. Maybe for GPT-3.5/ChatGPT as well, but they’ve been hush-hush about that.

maynarkh@feddit.nl

15 points

1 year ago

That’s why they don’t feel they can operate in the EU, as the EU will mandate AI companies to publish what datasets they trained their solutions on.

Margot Robbie@lemmy.world

20 points

1 year ago

She’s going to lose the lawsuit. It’s an open and shut case.

“Authors Guild, Inc. v. Google, Inc.” is the precedent case, in which the US Supreme Court established that transformative digitalization of copyrighted material inside a search engine constitutes fair use, and text used for training LLMs is even more transformative than book digitalization since it is near impossible to reconstitute the original work barring extreme overtraining.

You will have to understand why styles can’t and should not be able to be copyrighted, because that would honestly be a horrifying prospect for art.

patatahooligan@lemmy.world

10 points

1 year ago

“Transformative” in this context does not mean simply not identical to the source material. It has to serve a different purpose and to provide additional value that cannot be derived from the original.

The summary that they talk about in the article is a bad example for a lawsuit because it is indeed transformative. A summary provides a different sort of value than the original work. However if the same LLM writes a book based on the books used as training data, then it is definitely not an open and shut case whether this is transformative.

Margot Robbie@lemmy.world

4 points

1 year ago

But what an LLM does meets your listed definition of transformative as well. It indeed provides additional value that can’t be derived from the original, because everything it outputs is completely original but similar in style to the original that you can’t use to reconstitute the original work; in other words, similar to fan work, which is also why the current ML models, text2text or text2image, are called “transformers”. Again, works similar in style to the original cannot and should not be considered copyright infringement, because that’s a can of worms nobody actually wants to open, and the courts have been very consistent on that.

So, I would find it hard to believe that if there is a Supreme Court ruling which finds digitalizing copyrighted material in a database is fair use and not derivative work, that they wouldn’t consider digitalizing copyrighted material in a database with very lossy compression (that’s a more accurate description of what LLMs are, please give this a read if you have time) fair use as well. Of course, with the current Roberts court, there is always the chance that weird things can happen, but I would be VERY surprised.

There is also the previous ruling that raw transformer output cannot be copyrighted, but that’s beyond the scope of this post for now.

My problem with LLM outputs is mostly that they are just bad writing, and I’ve been pretty critical of “Open”AI elsewhere on Lemmy, but I don’t see Silverman’s case going anywhere.

patatahooligan@lemmy.world

0 points

1 year ago

But what an LLM does meets your listed definition of transformative as well

No it doesn’t. Sometimes the output is used in completely different ways but sometimes it is a direct substitute. The most obvious example is when it is writing code that the user intends to incorporate into their work. The output is not transformative by this definition as it serves the same purpose as the original works and adds no new value, except stripping away the copyright of course.

everything it outputs is completely original

[citation needed]

that you can’t use to reconstitute the original work

Who cares? That has never been the basis for copyright infringement. For example, as far as I know I can’t make and sell a doll that looks like Mickey Mouse from Steamboat Willie. It should be considered transformative work. A doll has nothing to do with the cartoon. It provides a completely different sort of value. It is not even close to being a direct copy or able to reconstitute the original. And yet, as far as I know I am not allowed to do it, and even if I am, I won’t risk going to court against Disney to find out. The fear alone has made sure that we mere mortals cannot copy and transform even the smallest parts of copyrighted works owned by big companies.

I would find it hard to believe that if there is a Supreme Court ruling which finds digitalizing copyrighted material in a database is fair use and not derivative work

Which case are you citing? Context matters. LLMs aren’t just a database. They are also a frontend to extract the data from these databases, that is being heavily marketed and sold to people who might otherwise have bought the original works instead.

The lossy compression is also irrelevant, otherwise literally every pirated movie/series release would be legal. How lossy is it even? How would you measure it? I’ve seen GitHub Copilot spit out verbatim copies of code. I’m pretty sure that if I ask ChatGPT to recite a very well-known poem, it will also be a verbatim copy. So there are at least some works that are included completely losslessly. Which ones? No one knows, and that’s a big problem.

-4 points

1 year ago

Lmao all these lawsuits smell like toilet paper to me; and probably another attack on AI to slow it down

Numuruzero@lemmy.dbzer0.com

9 points

1 year ago

Honestly, a lot of them bring up necessary questions. AI being developed so quickly means a lot of questions got pushed off until later.

damnYouSun@sh.itjust.works

-1 points

1 year ago

Absolutely, but this one’s especially stupid.

It’s like claiming that I am guilty of copyright violation because I read their book. If I regurgitated their novel word for word, for free, to anyone who asked for it, then yeah, that would be copyright violation. However, I sincerely doubt that is what’s actually happening here.

Sarah Silverman Sues Maker Of ChatGPT For Copyright Infringement (www.huffpost.com)

Technology

!technology@lemmy.ml
