337 points

So when’s the ruling against OpenAI and the like using the same copyrighted material to train their models

132 points

But OpenAI not being allowed to use the content for free means they are being prevented from making a profit, whereas the Internet Archive is giving away the stuff for free and taking away the right of the authors to profit. /s

Disclaimer: this is the argument that OpenAI is using currently, not my opinion.

82 points
*

Ah, I see you got that all wrong.

OpenAI uses that content to generate billions in profit on the backs of The People. The Internet Archive just does it for the good of The People.

We can’t have that. “Good for The People” is not how the economy works, pal. We need profit and exploitation for the world to work…

23 points

OpenAI is burning billions of dollars, not making a profit.

4 points

Sounds like they are operating the same as all the other big tech companies then

0 points
4 points

“Good for the people”? You mean COMMUNISM?

3 points

I think you accidentally wrote “Open IA” instead of “OpenAI”. IA happens to be the initials of Internet Archive, so it’s a little confusing.

4 points

I didn’t even realise. Thank you for pointing it out, I fixed it.

21 points

Hot on the heels of this one, I’d imagine.

47 points

Fat chance. Line must go up.

18 points

So, let’s say we create an LLM that is fed all the copyrighted data, and we design it so that it recalls the originals when asked?! Does that count as piracy, or as the kind of legal shenanigans OpenAI is doing?

6 points

Aaaaaany minute now.

6 points

Two different things are happening. One is redistribution, which isn’t allowed, and the other is fair use, which is allowed. You can’t ban someone from writing a detailed synopsis of your book. That’s all an LLM is doing. It’s no different from a human reading the material and then using that to write something similar.

16 points
*

“the other is fair use”

That’s very much up for debate still.

(I am personally still undecided)

4 points

The difference is that the LLM can consume and remember all available information, whereas a human would have difficulty remembering everything in detail. We still see humans unintentionally remaking things they’ve heard before. Comedians have unintentionally stolen jokes they’ve heard. Every songwriter has unintentionally “discovered” a catchy tune which is actually someone else’s. We have fanfiction and parody. Most people’s personalities are just an amalgamation of everyone and everything they’ve ever seen, not unlike an LLM.

3 points
*

I think that’s the difference right there.

One is up for debate, the other one is already heavily regulated currently. Libraries are generally required to have consent if they are making straight copies of copyrighted works. Whether we like it or not.

What AI does is not really a straight-up copy, which is why it’s fuzzy and much harder to regulate without stepping on our own toes, especially as tech advances and the difference between a human reading something and a machine doing it becomes harder and harder to detect.

3 points
*

The matter is not LLMs reproducing what they have learned, it is that they didn’t pay for the books they read, like people are supposed to do legally.

This is not about free use, this is about free access, which at the scale of an individual reading books is marketed as “piracy”… at the scale of reading all books known to man… it’s omnipiracy?

We need some kind of deal where commercial LLMs either pay rent to a fund that distributes it among creators or remain nonprofit, which is never gonna happen, because it’ll be a bummer for all the grifters rushing into that industry.

1 point

I think we need to re-examine what copyright should be. There’s nothing inherently immoral about “piracy” when the original creator gets almost nothing for their work after the initial release.

1 point

“it is that they didn’t pay for the books they read, like people are supposed to do legally.”

If I can read a book from a library, why shouldn’t OpenAI or anybody else?

…but yes, from what I’ve heard, they (or whoever, I don’t remember) actually trained on LibGen. OpenAI can be scummy without the general process of feeding AI books you only have read access to being scummy.

-1 points

stop asking questions and go back to work


Technology

!technology@lemmy.world
