lemm.ee


867

Thousands of authors demand payment from AI companies for use of copyrighted works (www.cnn.com)

posted 1 year ago by L4sBot@lemmy.world in technology@lemmy.world

Thousands of published authors are requesting payment from tech companies for the use of their copyrighted works in training artificial intelligence tools, marking the latest intellectual property critique to target AI development.


cerevant@lemmy.world

44 points

1 year ago

There is already a business model for compensating authors: it is called buying the book. If the AI trainers are pirating books, then yeah - sue them.

There are plagiarism and copyright laws to protect the output of these tools: if the output is infringing, then sue them. However, if the output of an AI would not be considered infringing for a human, then it isn’t infringement.

When you sell a book, you don’t get to control how that book is used. You can’t tell me that I can’t quote your book (within fair use restrictions). You can’t tell me that I can’t refer to your book in a blog post. You can’t dictate who may and may not read a book. You can’t tell me that I can’t give a book to a friend. Or an enemy. Or an anarchist.

Folks, this isn’t a new problem, and it doesn’t need new laws.


scarabic@lemmy.world

30 points

1 year ago

When you sell a book, you don’t get to control how that book is used.

This is demonstrably wrong. You cannot buy a book, and then go use it to print your own copies for sale. You cannot use it as a script for a commercial movie. You cannot go publish a sequel to it.

Now please just try to tell me that AI training is specifically covered by fair use and satire case law. Spoiler: you can’t.

This is a novel (pun intended) problem space and deserves to be discussed and decided, like everything else. So yeah, your cavalier dismissal is cavalierly dismissed.


Zormat@lemmy.blahaj.zone

10 points

1 year ago

I completely fail to see how it wouldn’t be considered transformative work


scarabic@lemmy.world

9 points

1 year ago

It fails the transcendence criterion. Transformative works go beyond the original purpose of their source material to produce a whole new category of thing or benefit that would otherwise not be available.

Taking 1000 fan paintings of Sauron and using them in combination to create 1 new painting of Sauron in no way transcends the original purpose of the source material. The AI painting of Sauron isn’t some new and different thing. It’s an entirely mechanical iteration on its input material. In fact the derived work competes directly with the source material which should show that it’s not transcendent.

We can disagree on this and still agree that it’s debatable and should be decided in court. The person above that I’m responding to just wants to say “bah!” and dismiss the whole thing. If we can litigate the issue right here, a bar I believe this thread has already met, then judges and lawmakers should litigate it in our institutions. After all the potential scale of this far reaching issue is enormous. I think it’s incredibly irresponsible to say feh nothing new here move on.


HumbertTetere@feddit.de

3 points

1 year ago

I do think you have a point here, but I don’t agree with the example. If a fan creates the 1001st fan painting after looking at the others, it might be quite similar to them if the fan lacks the artistic skill to express their own unique view. It also competes with its sources, yet it’s generally accepted.


Phlogiston@lemmy.world

3 points

1 year ago

Being able to dialog with a book, even to the point of asking the AI to “take on the persona of a character in the book” and sustain an ongoing conversation, is substantively a transcendent version of the original. That one can, as a small subset of that transformed version, get quotes from the original work feels like a small part of this new work.

If this had been released for a single work, like “here is a Star Wars AI that can take on the persona of Star Wars characters” and answer questions about the Star Wars universe etc., I think it’s more likely that the position I’m taking here would lose the debate. But this is transformative against the entire set of prior material from books, movies, film, debate, art, science, philosophy etc. It merges and combines all of that. I think the sheer scope of this new thing supports the idea that it’s truly transformative.

A possible compromise would be to tax AI and use the proceeds to fund a UBI initiative. True, we’d get to argue if high profile authors with IP that catches the public’s attention should get more than just a blogger or a random online contributor – but the basic path is that AI is trained on and succeeds by standing on the shoulders of all people. So all people should get some benefits.


jecxjo@midwest.social

8 points

1 year ago

Typically the argument has been “a robot can’t make transformative works because it’s a robot.” People think our brains are special when in reality they are just really lossy.


Zormat@lemmy.blahaj.zone

6 points

1 year ago

Even if you buy that premise, the output of the robot is only superficially similar to the work it was trained on, so no copyright infringement there, and the training process itself is done by humans, and it takes some tortured logic to deny the technology’s transformative nature


Square Singer@feddit.de

0 points

1 year ago

Go ask ChatGPT for the lyrics of a song, and then tell me that’s transformative work when it outputs the exact lyrics.


player2@lemmy.world

3 points

1 year ago


Well, they’re fixing that now. I just asked ChatGPT to tell me the lyrics to Stairway to Heaven and it replied with a brief description of who wrote it and when, then said “here are the lyrics:” and stopped 3 words into the lyrics.

In theory as long as it isn’t outputting the exact copyrighted material, then all output should be fair use. The fact that it has knowledge of the entire copyrighted material isn’t that different from a human having read it, assuming it was read legally.


jecxjo@midwest.social

4 points

1 year ago

Go ask a human for the lyrics of a song and then tell me that’s transformative work.

Oh wait, no one would say that. This is why the discussion with non-technical people goes into the weeds.


jecxjo@midwest.social

1 point

1 year ago

Oh i think those people are wrong, but we tend to get laws based on people who don’t understand a topic deciding how it should work.


Hildegarde@lemmy.world

4 points

1 year ago

Transformativeness is only one of the four fair use factors. Being transformative can’t alone make something fair use.

Even if AI is transformative, it would likely fail on the third factor. Fair use requires you to take the minimum amount of the copyrighted work, and AI companies scrape as much data as possible to train their models. Very unlikely to support a finding of fair use.

The final factor is market impact. Generative AIs are built to mimic the creative outputs of human authorship. By design, AI acts as a market replacement for human authorship, so it would likely fail on this factor as well.

Regardless, trained AI models are unlikely to be copyrightable. Copyrights require human authorship which is why AI and animal generated art are not copyrightable.

A trained AI model is a piece of software so it should be protectable by patents because it is functional rather than expressive. But a patent requires you to describe how it works, so you can’t do that with AI. And a trained AI model is self-generated from training data, so there’s no human authorship even if trained AI models were copyrightable.

Exactly which laws apply to AI models is unclear, and it will likely be determined by court cases.


cerevant@lemmy.world

7 points

1 year ago

No, you misunderstand. Yes, they can control how the content in the book is used - that’s what copyright is. But they can’t control what I do with the book - I can read it, I can burn it, I can memorize it, I can throw it up on my roof.

My argument is that there is nothing wrong with training an AI with a book - that’s input for the AI, and that is indistinguishable from a human reading it.

Now what the AI does with the content - if it plagiarizes or violates fair use - that’s a problem, but those problems are already covered by copyright laws. They have no more business saying what can or cannot be input into an AI than they can restrict what I can read (and learn from). They can absolutely enforce their copyright on the output of the AI just like they can if I print copies of their book.

My objection is strictly on the input side, and the output is already restricted.


Redtitwhore@lemmy.world

4 points

1 year ago

Makes sense. I would love to hear how anyone can disagree with this. Just because an AI learned or trained from a book doesn’t automatically mean it violated any copyrights.


cerevant@lemmy.world

2 points

1 year ago


The base assumption of those with that argument is that an AI is incapable of being original, so it is “stealing” anything it is trained on. The problem with that logic is that’s exactly how humans work - everything they say or do is derivative from their experiences. We combine pieces of information from different sources, and connect them in a way that is original - at least from our perspective. And not surprisingly, that’s what we’ve programmed AI to do.

Yes, AI can produce copyright violations. They should be programmed not to. They should cite their sources when appropriate. AI needs to “learn” the same lessons we learned about not copy-pasting Wikipedia into a term paper.


lily33@lemmy.world

2 points

1 year ago


It’s specifically distribution of the work or derivatives that copyright prevents.

So you could make an argument that an LLM that’s memorized the book and can reproduce (parts of) it upon request is infringing. But one that’s merely trained on the book, but hasn’t memorized it, should be fine.


scarabic@lemmy.world

-1 points

1 year ago

But by their very nature LLMs simply redistribute the material they’ve been trained on. They may disguise it assiduously, but there is no person at the center of the thing adding creative strokes. It’s copyrighted material in, copyrighted material out, so the plaintiffs allege.


lily33@lemmy.world

0 points

1 year ago

They don’t redistribute. They learn information about the material they’ve been trained on - not the material itself - and can use it to generate material they’ve never seen.

Bigger models seem to memorize some of the material and can infringe, but that’s not really the goal.


Dark Arc@lemmy.world

58 points

1 year ago

It’s 100% a new problem. There’s established precedent for things costing different amounts depending on their intended use.

For example, buying a consumer copy of a song doesn’t give you the right to play that song in a stadium or a restaurant.

Training an entire AI to make potentially an infinite number of derived works from your work is 100% worthy of requiring a special agreement. This even goes beyond simple payment to consent; a climate expert might not want their work in an AI which might severely mischaracterize the conclusions, or might want to require that certain queries are regularly checked by a human, etc.


cerevant@lemmy.world

0 points

1 year ago

My point is that the restrictions can’t go on the input, they have to go on the output - and we already have laws that govern such derivative works (or reuse / rebroadcast).


bouncing@partizle.com

0 points

1 year ago

The thing is, copyright isn’t really well-suited to the task, because copyright concerns itself with who gets to, well, make copies. Training an AI model isn’t really making a copy of that work. It’s transformative.

Should there be some kind of new model of remuneration for creators? Probably. But it should be a compulsory licensing model.


Fedizen@lemmy.world

-1 points

1 year ago

Challenge level impossible: try uploading something long to Amazon written by ChatGPT without triggering the plagiarism detector.


bouncing@partizle.com

3 points

1 year ago

https://www.reuters.com/technology/chatgpt-launches-boom-ai-written-e-books-amazon-2023-02-21/


jecxjo@midwest.social

6 points

1 year ago

The slippery slope here is that we are currently considering humans and computers to be different because (something someone needs to actually define). If you say “AI read my book and output a similar story, you owe me money” then how is that different from “Joe read my book and wrote a similar story, you owe me money.” We have laws already that deal with this but honestly how many books and movies aren’t just remakes of Romeo and Juliet or Taming of the Shrew?!?


Square Singer@feddit.de

1 point

1 year ago

Well, Shakespeare has been dead for a few years now, there’s no copyright to speak of.

And if you make a book based on an existing one, then you totally need permission from the author. You can’t just e.g. make a Harry Potter 8.

But AIs are more than happy to do exactly that. Or to even reproduce copyrighted works 1:1, or only with a few mistakes.


jecxjo@midwest.social

1 point

1 year ago

It seems like people are afraid that AI can do it when i can do it too. But their reason for freaking out is…??? It’s not like AI is calling up publishers trying to get Harry Potter 8 published. If i ask it to create Harry Potter 1 but change his name to Gary Trotter it’s not the AI that is doing something bad, it’s me.

That was my point. I can memorize text and it’s only when I play it off as my own that it’s wrong. No one cares that I memorized the first chapter and can recite it if I’m not trying to steal it.


Square Singer@feddit.de

1 point

1 year ago

That’s not correct. The issue is not whether you play it off as your own, but how much the damages are that you can be sued for. If you recite something that you memorized in front of a handful of friends, the damages are nonexistent and hence there is no point in suing you.

But if you give a large commercial concert and perform a cover song without permission, you will get sued, no matter if you say “This song is from <insert original artist> and not from me”, because it’s not about giving credit, it’s about money.

And regarding getting something published: This is not so much about big name art like Harry Potter, but more about people doing smaller work. For example, voice actors (both for movie translations and smaller things like announcements in public transport) are now routinely replaced by AI that was trained on their own voices without their permission.

Similar story with e.g. people who write texts for homepages and ad material. Stuff like that. And that has real-world consequences already now.


Phlogiston@lemmy.world

3 points

1 year ago

If a person writes a fanfic harry potter 8 it isn’t a problem until they try to sell it or distribute it widely. I think where the legal issues get sticky here are who caused a particular AI generated Harry Potter 8 to be written.

If the AI model attempts to block this behavior with contract stipulations and guardrails, and if it isn’t advertised as “a harry potter generator” but instead as a general purpose tool… then reasonably the legal liability might be on the user that decides to do this or not, rather than on the tool that makes such behavior possible.

Hypothetically, what if an AI was trained up that never read Harry Potter, but it’s pretty darn capable, and I feed into it the entire Harry Potter novel(s) as context in my prompt and then ask it to generate an eighth story - is the tool at fault or am I?


Square Singer@feddit.de

2 points

1 year ago

Fanfic can actually be a legal problem. It’s usually not prosecuted, because it harms the brand to do so, but if a company was doing that professionally, they’d get into serious hot water.

Regarding your hypothetical scenario: If you train the AI with copyrighted works, so that you can make it reproduce HP8, then you are at fault.

If the tool was trained with HP books and you just ask really nicely to circumvent the protections, I would guess the tool (=> its creators) would certainly be at fault (since it did train on copyrighted material and the protections were obviously not good enough), and at the latest when you reproduce the output, you too are.


bouncing@partizle.com

2 points

1 year ago

If you say “AI read my book and output a similar story, you owe me money” then how is that different from “Joe read my book and wrote a similar story, you owe me money.”

You’re bounded by the limits of your flesh. AI is not. The $12 you spent buying a book at Barnes & Noble was based on the economy of scarcity that your human abilities constrain you to.

It’s hard to say that the value proposition is the same for human vs AI.


jecxjo@midwest.social

2 points

1 year ago

We are making an assumption that humans do “human things”. If i wrote a derivative work of your $12 book, does it matter that the way i wrote it was to use a pen and paper and create a statistical analysis of your work and find the “next best word” until i had a story? Sure my book took 30 years to write but if i followed the same math as an AI would that matter?


bouncing@partizle.com

0 points

1 year ago

It wouldn’t matter, because derivative works require permission. But I don’t think anyone’s really made a compelling case that OpenAI is actually making directly derivative work.

The stronger argument is that LLMs are making transformational work, which is normally fair use, but should still require some form of compensation given the scale of it.


BartsBigBugBag@lemmy.tf

1 point

1 year ago

It’s not even looking for the next best word. It’s looking for the next best token. It doesn’t know what words are. It reads tokens.


Avid Amoeba@lemmy.ca

4 points

1 year ago

Copyright also deals with derivative works.


bouncing@partizle.com

1 point

1 year ago

Derivative and transformative are quite different though.


bh11235@infosec.pub

2 points

1 year ago


Well, fine, and I can’t fault new published material having a “no AI” clause in its terms of service. But that doesn’t mean we get to dream this clause into being retroactively for all the works ChatGPT was trained on. Even the most reasonable law in the world can’t be enforced on someone who broke it 6 months before it was legislated.

Fortunately the “horses out the barn” effect here is maybe not so bad. Imagine the FOMO and user frustration when ToS & legislation catch up and now ChatGPT has no access to the latest books, music, news, research, everything. Just stuff from before authors knew to include the “hands off” clause - basically like the knowledge cutoff, but forever. It’s untenable, OpenAI will be forced to cave and pay up.


CmdrShepard@lemmy.one

0 points

1 year ago

Even the most reasonable law in the world can’t be enforced on someone who broke it 6 months before it was legislated.

Sure it can. Just because it is a new law doesn’t mean they get to continue benefiting from IP ‘theft’ forever into the future.

Imagine the FOMO and user frustration when ToS & legislation catch up and now ChatGPT has no access to the latest books, music, news, research, everything. Just stuff from before authors knew to include the “hands off” clause

How is this an issue for the IP holders? Just because you build something cool or useful doesn’t mean you get a pass to do what you want.

basically like the knowledge cutoff, but forever. It’s untenable,

Untenable for ChatGPT maybe, but it’s not as if it’s the end of ‘knowledge’ or the end of AI. It’s just a single company’s product.


DandomRude@lemmy.world

11 points

1 year ago

OpenAI and such being forced to pay a share seems far from the worst scenario I can imagine. I think it would be much worse if artists, writers, scientists, open source developers and so on were forced to stop making their works freely available because they don’t want their creations to be used by others for commercial purposes. That could really mean that large parts of humanity would be cut off from knowledge.

I can well imagine copyleft gaining importance in this context. But this form of licencing seems pretty worthless to me if you don’t have the time or resources to sue for your rights - or even to deal with the various forms of licencing you need to know about to do so.


kklusz@lemmy.world

1 point

1 year ago

I think it would be much worse if artists, writers, scientists, open source developers and so on were forced to stop making their works freely available because they don’t want their creations to be used by others for commercial purposes.

None of them are forced to stop making their works freely available. If they want to voluntarily stop making their works freely available to prevent commercial interests from using them, that’s on them.

Besides, that’s not so bad to me. The rest of us who want to share with humanity will keep sharing with humanity. The worst case imo is that artists, writers, scientists, and open source developers cannot take full advantage of the latest advancements in tech to make more and better art, writing, science, and software. We cannot let humanity’s creative potential be held hostage by anyone.

That could really mean that large parts of humanity would be cut off from knowledge.

On the contrary, AI is making knowledge more accessible than ever before to large parts of humanity. The only comparable other technologies that have done this in recent times are the internet and search engines. Thank goodness the internet enables piracy that allows anyone to download troves of ebooks for free. I look forward to AI doing the same on an even greater scale.


Flying Squid@lemmy.world

7 points

1 year ago

Shouldn’t there be a way to freely share your works without having to expect an AI to train on them and then be able to spit them back out elsewhere without attribution?


kklusz@lemmy.world

1 point

1 year ago

No, there shouldn’t because that would imply restricting what I can do with the information I have access to. I am in favor of maintaining the sort of unrestricted general computing that we already have access to.


CmdrShepard@lemmy.one

3 points

1 year ago

The rest of us who want to share with humanity will keep sharing with humanity. The worst case imo is that artists, writers, scientists, and open source developers cannot take full advantage of the latest advancements in tech to make more and better art, writing, science, and software. We cannot let humanity’s creative potential be held hostage by anyone.

You’re not talking about sharing it with humanity, you’re talking about feeding it into an AI. How is this holding back the creative potential of humanity? Again, you’re talking about feeding and training a computer with this material.


assassin_aragorn@lemmy.world

9 points

1 year ago

However, if the output of an AI would not be considered infringing for a human, then it isn’t infringement.

It’s an algorithm that’s been trained on numerous pieces of media by a company looking to make money off it. I see no reason to give them a pass on fairly paying for that media.

You can see this if you reverse the comparison, and consider what a human would do to accomplish the task in a professional setting. That’s all an algorithm is. An execution of programmed tasks.

If I gave a worker a pirated link to several books and scientific papers in the field, and asked them to synthesize an overview/summary of what they read and publish it, I’d get my ass sued. I have to buy the books and the scientific papers. STEM companies regularly pay for access to papers and codes and standards. Why shouldn’t an AI have to do the same?


Saik0@lemmy.saik0.com

1 point

1 year ago

It’s an algorithm that’s been trained on numerous pieces of media by a company looking to make money off it.

If I read your book… and get an amazing idea… Turn it into a business and make billions off of it. You still have no right to anything. This is no different.

If I gave a worker a pirated link to several books and scientific papers in the field

There’s been no proof or evidence provided that ANY content was ever pirated. Has any of the companies even provided the dataset they’ve used yet?

Why is this the presumption that they did it the illegal way?


CmdrShepard@lemmy.one

3 points

1 year ago

If I read your book… and get an amazing idea… Turn it into a business and make billions off of it. You still have no right to anything. This is no different

I don’t see how this is even remotely the same? These companies are using this material to create their commercial product. They’re not consuming it personally and developing a random idea later, far removed from the book itself.

I can’t just buy (or pirate) a stack of Blu-rays and then go start my own Netflix, which is akin to what is happening here.


Saik0@lemmy.saik0.com

-1 points

1 year ago

They’re not consuming it personally and developing a random idea later, far removed from the book itself.

I never said that the idea would be removed from the book. You can literally take the idea from the book itself and make the money. There would be no issues. There is no dues owed to the book’s writer.

This is the whole premise for educational textbooks. You can explain to me how the whole world works in book form… I can go out and take those ideas wholesale from your book and apply them to my business and literally make money SOLELY from information from your book. There’s nothing due back to you as a writer from me nor my business.


CmdrShepard@lemmy.one

2 points

1 year ago

You’ve failed to explain how that relates to your point. Sure you can purchase an economics textbook and then go become a finance bro, but that’s not what they’re doing here. They’re taking that textbook (that wasn’t paid for) and feeding it into their commercial product. The end product is derived from the author’s work.

To put it a different way, would they still be able to produce ChatGPT if one of the developers simply read that same textbook and then inputted what they learned into the model? My guess is no.

It’d be the same if I went and bought CDs, ripped my favorite tracks, and then put them into a compilation album that I then sold for money. My product can’t exist without having copied the original artists’ work. ChatGPT just obfuscates that by copying a lot of songs.


Saik0@lemmy.saik0.com

0 points

1 year ago

They’re taking that textbook (that wasn’t paid for) and feeding it into their commercial product.

Nobody has provided any evidence that this is the case. Until this is proven it should not be assumed. Bandwagoning (and repeating this over and over again without any evidence or proof) against the ML people without evidence is not fair. The whole point of the Justice system is innocent until proven guilty.

The end product is derived from the author’s work.

Derivative works are 100% protected under copyright law. https://www.legalzoom.com/articles/what-are-derivative-works-under-copyright-law

This is the same premise that allows “fair use” that we all got up in arms about on YouTube. Claiming that this doesn’t exist now in this case means that all that stuff we fought for on YouTube needs to be rolled back.

To put it a different way, would they still be able to produce ChatGPT if one of the developers simply read that same textbook and then inputted what they learned into the model? My guess is no.

Why not? Why can’t someone grab a book, scan it… chuck it into an OCR and get the same content? There are plenty of ways that snippets of raw content could make it into these repositories WITHOUT asserting legal problems.

It’d be the same if I went and bought CDs, ripped my favorite tracks, and then put them into a compilation album that I then sold for money.

No… You could, for all intents and purposes, have recorded all your songs from the radio onto a cassette… That would be 100% legal for personal consumption… which would be what the ML authors are doing. ChatGPT and others could have sourced information from published sources that are completely legit. No “Author” has provided any evidence yet that ChatGPT and others have actually broken a law. For all we know the authors of these tools have library cards, and fed in screenshots of the digital scans of the book or hand scanned the book. Or didn’t even use the book at all and contextually grabbed a bunch of content from the internet at large.

Since the ML bots are all making derivative works, rather than spitting out original content… they’d be covered by copyright as a derivative work.

This only becomes an actual problem if you can prove that these tools have done BOTH

obtain content in an illegal fashion
provide the copyrighted content freely without fair-use or other protections.


bouncing@partizle.com

0 points

1 year ago

A better comparison would probably be sampling. Sampling is fair use in most of the world, though there are mixed judgments. I think most reasonable people would consider the output of ChatGPT to be transformative use, which is considered fair use.


bouncing@partizle.com

10 points

1 year ago

If I gave a worker a pirated link to several books and scientific papers in the field, and asked them to synthesize an overview/summary of what they read and publish it, I’d get my ass sued. I have to buy the books and the scientific papers.

Well, if OpenAI knowingly used pirated work, that’s one thing. It seems pretty unlikely and certainly hasn’t been proven anywhere.

Of course, they could have done so unknowingly. For example, if John C Pirate published the transcripts of every movie since 1980 on his website, and OpenAI merely crawled his website (in the same way Google does), it’s hard to make the case that they’re really at fault any more than Google would be.


assassin_aragorn@lemmy.world

1 point

1 year ago

Haven’t people asked it to reproduce specific chapters or pages of specific books and it’s gotten it right?


bouncing@partizle.com

1 point

1 year ago

I haven’t been able to reproduce that, and at least so far, I haven’t seen any very compelling screenshots of it that actually match. Usually it just generates text, but that text doesn’t actually match.


assassin_aragorn@lemmy.world

1 point

1 year ago

Gotcha. This seems like a good way to test for it then, I think.


cactusupyourbutt@lemmy.world

2 points

1 year ago

well no, because the summary is its own copyrighted work


Saik0@lemmy.saik0.com

1 point

1 year ago

Right, but not one the author of the book could go after. The article publisher would have the closest rights to a claim. But if I read the crib notes and a few reviews of a movie… Then go to summarize the movie myself… That’s derivative content and is protected under copyright.


bouncing@partizle.com

2 points

1 year ago

The published summary is open to fair use by web crawlers. That was settled in Perfect 10 v Amazon.


Cloudless ☼@feddit.uk

15 points

1 year ago

I asked Bing Chat for the 10th paragraph of the first Harry Potter book, and it gave me this:

“He couldn’t know that at this very moment, people meeting in secret all over the country were holding up their glasses and saying in hushed voices: ‘To Harry Potter – the boy who lived!’”

It looks like technically I might be able to obtain the entire book (eventually) by asking Bing the right questions?


cerevant@lemmy.world

4 points

1 year ago

Then this is a copyright violation - it violates any standard for such, and the AI should be altered to account for that.

What I’m seeing is people complaining about content being fed into AI, and I can’t see why that should be a problem (assuming it was legally acquired or publicly available). Only the output can be problematic.


GentlemanLoser@reddthat.com

5 points

1 year ago

No, the AI should be shut down and the owner should first be paying the statutory damages for each use of registered works of copyright (assuming all parties are in the USA).

If they have a company left after that, then they can fix the AI.


cerevant@lemmy.world

8 points

1 year ago

Again, my point is that the output is what can violate the law, not the input. And we already have laws that govern fair use, rebroadcast, etc.


DandomRude@lemmy.world

4 points

1 year ago

I think it’s not just the output. I can buy an image on any stock platform, print it on a T-shirt, wear it myself or gift it to somebody. But if I want to sell T-shirts using that image I need a commercial licence, even if I alter the original image extensively or combine it with other assets to create something new. It’s not exactly the same thing, but OpenAI and other companies certainly use copyrighted material to create and improve commercial products. So this doesn’t seem like the same kind of usage an average Joe buys a book for.


bouncing@partizle.com

8 points

1 year ago

There is already a business model for compensating authors: it is called buying the book. If the AI trainers are pirating books, then yeah - sue them.

That’s part of the allegation, but it’s unsubstantiated. It isn’t entirely coherent.


Flying Squid@lemmy.world

3 points

1 year ago

It’s not entirely unsubstantiated. Sarah Silverman was able to get ChatGPT to regurgitate passages of her book back to her.


AnonStoleMyPants@sopuli.xyz

2 points

1 year ago

I don’t know if this holds water though. You don’t need to train the AI on the book itself to get that result. Just on discussions about the book, which for sure include passages from the book.


bouncing@partizle.com

3 points

1 year ago

Her lawsuit doesn’t say that. It says,

when ChatGPT is prompted, ChatGPT generates summaries of Plaintiffs’ copyrighted works—something only possible if ChatGPT was trained on Plaintiffs’ copyrighted works

That’s an absurd claim. ChatGPT has surely read hundreds, perhaps thousands of reviews of her book. It can summarize it just like I can summarize Othello, even though I’ve never seen the play.


volkhavaar@lemmy.world

16 points

1 year ago

This is a little off. When you quote a book, you put the name of the book you’re quoting. When you refer to a book, you, um, refer to the book?

I think the gist of these authors’ complaints is that a sort of “technology laundered plagiarism” is occurring.


cerevant@lemmy.world

1 point

1 year ago

Copyright 100% applies to the output of an AI, and it is subject to all the rules of fair use and attribution that entails.

That is very different than saying that you can’t feed legally acquired content into an AI.
