Thousands of authors demand payment from AI companies for use of copyrighted works

Thousands of published authors are requesting payment from tech companies for the use of their copyrighted works in training artificial intelligence tools, marking the latest intellectual property critique to target AI development.

47 points

How can they prove that their particular intellectual property, rather than some generic public data, was used to train the algorithms?

73 points

Well, if you ask e.g. ChatGPT for the lyrics to a song or page after page of a book, and it spits them out 1:1 correct, you could assume that it must have had access to the original.

30 points

Or at least excerpts from it. But even then, it’s one thing for a person to put up a quote from their favourite book on their blog, and a completely different thing for a private company to use that data to train a model, and then sell it.

21 points
*
Deleted by creator
9 points

Can it recreate anything 1:1? When both my wife and I tried to get them to do that they would refuse, and if pushed they would fail horribly.

10 points

This is what I got. Looks pretty 1:1 to me.

6 points

“you could assume that it must have had access to the original.”

I don’t know if that’s true. Say Google grabs that book from a pirate site, then publishes the work as search results, and ChatGPT grabs the work from Google’s results and cobbles it back together as the original.

Who’s at fault?

I don’t think it’s as straightforward as “ChatGPT can reproduce the work, therefore it stole it.”

22 points
*
Deleted by creator
9 points

Copyright doesn’t work like that. Say I sell you the rights to Thriller by Michael Jackson. You might not know that I don’t have the rights. But even if you bought the rights from me, whoever actually has the rights is totally in their legal right to sue you, because you never actually purchased any rights.

So if ChatGPT rips it off Google, who ripped it off a pirate site, then everyone in that chain who reproduced copyrighted works without permission from the copyright owners is liable for the damages caused by their unauthorized reproduction.

It’s literally the same as how downloading something from a pirate site isn’t made legal just because someone ripped it before you.

13 points

There are a lot of possible ways to audit an AI for copyrighted works, several of which have been proposed in the comments here, but what this could lead to is laws requiring an accounting log of all material that has been used to train an AI, as well as all copyrights, compensation, etc.
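
For illustration, here is a rough sketch of what such an accounting log might look like. Everything in it is an assumption made up for the example (the `TrainingRecord` fields, the function names, the licence labels); it does not describe any real company's process or any existing law.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass
class TrainingRecord:
    """One hypothetical log entry for a work used in training."""
    title: str
    rights_holder: str
    license: str   # e.g. "licensed", "public-domain" (made-up labels)
    sha256: str    # fingerprint of the exact text that was ingested


def fingerprint(text: str) -> str:
    """Hash the text so the log does not need to store the work itself."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_record(title: str, rights_holder: str, license: str, text: str) -> TrainingRecord:
    return TrainingRecord(title, rights_holder, license, fingerprint(text))


def was_used(log: list[TrainingRecord], text: str) -> bool:
    """Check whether this exact text appears in the log."""
    digest = fingerprint(text)
    return any(record.sha256 == digest for record in log)


def export_log(log: list[TrainingRecord]) -> str:
    """Serialize the log so an auditor could inspect it."""
    return json.dumps([asdict(record) for record in log], indent=2)
```

An exact hash only matches byte-identical text, so a real audit would need something fuzzier to catch excerpts or reformatted copies.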

9 points

Not without some seriously invasive warrants! Ones that will never be granted for an intellectual property case.

Intellectual property is an outdated concept. It used to exist so wealthier outfits couldn’t copy your work at scale and muscle you out of an industry you were championing.

It simply does not work the way it was intended. As technology spreads, the barrier for entry into most industries wherein intellectual property is important has been all but demolished.

i.e. 50 years ago: your song that your band performed is great. I have a recording studio and am gonna steal it muahahaha.

Today: “anyone have an audio interface I can borrow so my band can record, mix, master, and release this track?”

Intellectual property ignores the fact that, idk, Isaac Newton and Gottfried Wilhelm Leibniz both independently invented calculus at the same time on opposite ends of a disconnected globe. That is to say, intellectual property doesn’t exist.

Ever opened a post to make a witty comment to find someone else already made the same witty comment? Yeah. It’s like that.

14 points
*

Spoken like someone who has never had something they’ve worked on for years be stolen.

2 points

What was “stolen” from you and how?

0 points

Spoken like someone who is having trouble admitting they’re standing on the shoulders of Giants.

I don’t expect a nuanced response from you, nor will I waste time with folks who can’t be bothered to respond in any form beyond attack, nor do I expect you to watch this

Intellectual property died with the advent of the internet. It’s now just a way for the wealthy to remain wealthy.

-2 points
Deleted by creator
5 points

Personally speaking, I’ve generated some stupid images, like different cities covered in baked beans, and have had crude watermarks generate with them that were decipherable enough that I could find some of the source images used to train the AI. When it comes to photorealistic image generation, if all the AI does is mildly tweak the watermark, then it’s not too hard to trace back.

11 points

All but a very small few generative AI programs use completely destructive methods to create their models. There is no way to recover the training images outside of infinitesimally small random chance.

What you are seeing is the AI recognising that images of the sort you are asking for generally include watermarks, and creating one of its own.

4 points

Do you have examples? It should only happen in case of overfitting, i.e. too many identical images for the same subject.

1 point

Here’s one I generated and an image from the photographer. Prompt was Charleston SC covered in baked beans lol

4 points

I’d think that given the nature of the language models and how the whole AI thing tends to work, an author can pluck a unique sentence from one of their works, ask AI to write something about that, and if AI somehow ‘magically’ writes out an entire paragraph or even chapter of the author’s original work, well tada, AI ripped them off.
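
As a rough sketch of how that check could be scored, the snippet below finds the longest run of consecutive words that the model's output shares with the original text. The function names are made up for this example, and the output would have to be pasted in by hand; no AI service is called here. A long shared run is only a hint that the text was memorized, not proof of how it got there.

```python
from difflib import SequenceMatcher


def longest_shared_run(original: str, generated: str) -> str:
    """Return the longest contiguous run of words found in both texts."""
    a = original.split()
    b = generated.split()
    # autojunk=False stops difflib from ignoring very common words
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return " ".join(a[match.a:match.a + match.size])


def verbatim_fraction(original: str, generated: str) -> float:
    """Share of the generated words covered by that single longest run."""
    generated_words = generated.split()
    if not generated_words:
        return 0.0
    run_length = len(longest_shared_run(original, generated).split())
    return run_length / len(generated_words)
```

A short shared run is expected for any two texts in the same language, so it is the unusually long runs, whole sentences or paragraphs, that would be worth a closer look.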

2 points

I think that to protect creators they either need to be transparent about all content used to train the AI (highly unlikely) or have a disclaimer of liability, wherein if original content has been used in training the AI, then the original content creator would have standing for legal action.

The only other alternative would be to ensure that the AI specifically avoids copyrighted or trademarked content going back to a certain date.

2 points

Why a certain date? That feels arbitrary

1 point

At a certain age some media becomes public domain

1 point

They can’t. All they could prove is that their work is part of a dataset that still exists.


Technology

!technology@lemmy.world


This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to one another!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below. To ask if your bot can be added, please contact us.
  9. Check for duplicates before posting; duplicates may be removed.

Approved Bots


Community stats

  • 18K monthly active users
  • 11K posts
  • 518K comments