Comedian and author Sarah Silverman, as well as authors Christopher Golden and Richard Kadrey, are suing OpenAI and Meta, each in a US District Court, over dual claims of copyright infringement.

1 point

But the server used to train the model would have a copy of it. If training an AI model is not fair use, then the mere act of loading a book you don’t have a license for onto the server would be copyright infringement. Like, textbook. It’s an unauthorized digital copy. It’s all very untested legal ground, and it seems like lots of people want to be the first to test it. Not everyone has a great case, but if the courts interpret things a certain way there are going to be lots of payouts, so maybe it’s best to get in line early?

5 points

Perhaps, but that’s a separate legal issue from the model itself. You might have committed a breach of copyright in the process of gathering the material that the AI was trained on but the model itself is not a copy of that material and so is not itself illegal to train or use. And perhaps not even that, since downloading a pirated book is not the illegal part (uploading it is).

As you say, there’s some untested legal waters here. But it seems likely to me that the best that Silverman will accomplish is some nibbling and quibbling around the edges.

2 points

If you can give the model some vague prompts and get back something close enough to a significant chunk of the work that, had a human written it, it could be considered plagiarism… then I’d say the same laws that protect against plagiarism should apply there.

It doesn’t matter whether it’s really stored there in some form or not (in fact, it’s probably OK to store copyrighted material on a private server as long as it’s lawfully obtained), but whether the output being distributed to third parties violates the license of the work.

1 point

If you can give the model some vague prompts and get back something close enough to a significant chunk of the work that, had a human written it, it could be considered plagiarism… then I’d say the same laws that protect against plagiarism should apply there.

Perhaps, but that’s not even remotely what’s being alleged in this case. They’re asking ChatGPT for a summary of the book and it’s generating a summary a couple of pages long. Nothing is even close to verbatim, and I don’t know enough about any of the books to know if those summaries are even accurate. In my experience, ChatGPT often ends up hallucinating a lot of details when asked stuff like this.

1 point

Right, but you can sue over what happened on the training server. I’m guessing the training server still exists. I doubt they wiped it completely before the next round of training. If the training server infringes copyright, then you still lose the suit. Maybe. Remember that copyright law was not written with the internet in mind. If you have a “copy” and it’s not authorized, that might just be enough for a backwards court to find infringement.

I think of it in extremes. Imagine you had a video-producing model of the future. Could you then load up every MLB game ever recorded and train the model to make novel baseball games based on that, or would MLB be pissed that you had a server full of every MLB game ever recorded?
