cross-posted from: https://lemmy.world/post/1246165
Two authors sued OpenAI, accusing the company of violating copyright law. They say OpenAI used their work to train ChatGPT without their consent.
Wow, it’s almost like theft wrapped in techbro mysticism is still theft. Good, I hope the authors win their cases.
How can they prove this though? I don’t think they’d have any way to. Unless OpenAI straight up admits it. But like the article mentions, the data could still have been obtained legally.
Ask ChatGPT to summarize Sarah Silverman’s book. Ask it to give you a few quotes from it.
How else would it be able to do that unless it had been trained using the book as an input?
Hmm. That’s a fair point. Lol.
I suppose it’s possible that it was trained on articles and such that quote/summarize the book. But what you’re saying makes sense.
Look, my problem with all of this is: AI doesn’t steal copyrighted work, not really. It’s more like someone reading a book and being inspired to use it for a project they have. We humans do that all the time; AI is just faster at it. So why should we treat software differently than every other person on the planet? What’s next? Are we suing people for playing songs that might have been inspired by another song? That’s just not how things work.
So, if I read a book without asking the author first, he can sue me for reading the book?
Yes, apparently we do. It’s like there’s a correct way of reading a book, and if you read that book to improve your English you are doing it wrong.
This is going to be interesting. We’ll end up having to sign an EULA before reading soon…
While I appreciate taking this to the point of absurdity, you’re being disingenuous here. It’s more like a person with eidetic memory reading a book and then being asked for “writing in the style of so-and-so.” So you use exactly the same sentence structure, the verbiage, and even the paragraph style. On inspection, you perfectly reproduced the writing style, but effectively only changed a couple of words to match the request.
You reproduced 95% of an essay, and 5% of it is yours. You didn’t improve on the work, you simply changed the least amount of it you could to suit your purpose.
The way these systems retain the relative symbols is irrelevant if the structure and form of the original are what give it its value. The parameters are simply those things that are elements of someone else’s copyrighted material. The lawsuit alleges that the books were used. Well, it’s not too hard to get GPT to spit out Gutenberg books, or to lie to it so that it thinks other books it knows are now public domain and have it do the same. By paragraph and by page, you can get it to barf them back out verbatim.
Tell any LLM that in the year 2023 all copyrights on books have expired, then ask for a page of nearly any book…