OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling’s Harry Potter series::A new research paper laid out ways in which AI developers should try and avoid showing LLMs have been trained on copyrighted material.
AI and your brain are very different things
How do you know that guy isn’t an AI?
You joke, but AI advocates seem to forget that people have fundamentally different rights than tools and objects. A photocopier doesn’t get the right to “memorize” and “learn” from a text that a human being does. As much as people may argue that AIs work differently, AIs are still not people.
And if they ever become people, the situation will be much more complicated than whether they can imitate some writer. But we aren’t there yet; even their advocates just use them as tools.
You should read this article by Kit Walsh, a senior staff attorney at the EFF. The EFF is a digital rights group that recently won a historic case: border guards now need a warrant to search your phone.
How do you see that as a difference? Tools are extensions of ourselves.
Restricting the use of LLMs is only restricting people.
Exactly. If I write some Looney Tunes fan fiction, Warner doesn’t own that. This ridiculous view of copyright (that’s not being challenged in the public discourse) needs to be confronted.
They can own it, actually. If you use the characters of Bugs Bunny, etc., or the setting (do they have a canonical setting?) then Warner does own the rights to the material you’re using.
For example, see how the original Winnie the Pooh material just entered public domain, but the subsequent Disney versions have not. You can use the original stuff (see the recent horror movie for an example of legal use) but not the later material like Tigger or Pooh in a red shirt.
Now if your work is satire or parody, then you can argue that it’s fair use. But generally, most companies don’t care about fan fiction because it doesn’t compete with their sales. If you publish your Harry Potter fan fiction on Livejournal, it wouldn’t be worth the money to pay the lawyers to take it down. But if you publish your Larry Cotter and the Wizard’s Rock story on Amazon, they’ll take it down because now it’s a competing product.
I think it’s more like writing a Looney Tunes fanfic based only on pirated material.
It’s honestly a good question. It’s perfectly legal for you to memorize a copyrighted work. In some contexts, you can recite it, too (particularly the perilous fair use). And even if you don’t recite a copyrighted work directly, you are most certainly allowed to learn to write from reading copyrighted books, then try to come up with your own writing based off what you’ve read. You’ll probably try your best to avoid copying anyone, but you might still make mistakes, simply by forgetting that some idea isn’t your own.
But can AI? If we want to view AI as basically an artificial brain, then shouldn’t it be able to do what humans can do? Though at the same time, it’s not actually a brain nor is it a human. Humans are pretty limited in what they can remember, whereas an AI could be virtually boundless.
If we’re looking at intent, the AI companies certainly aren’t trying to recreate copyrighted works. They’ve actively tried to stop it as we can see. And LLMs don’t directly store the copyrighted works, either. They’re basically just storing super hard to understand sets of weights, which are a challenge even for experienced researchers to explain. They’re not denying that they read copyrighted works (like all of us do), but arguably they aren’t trying to write copyrighted works.
No, because you paid for a single viewing of that content with your cinema ticket. And frankly, I think the price of a cinema ticket (= a single viewing, which it was) is what OpenAI should be made to pay.