The New York Times is suing OpenAI and Microsoft for copyright infringement, claiming the two companies built their AI models by “copying and using millions” of the publication’s articles and now “directly compete” with its content as a result.
As outlined in the lawsuit, the Times alleges OpenAI and Microsoft’s large language models (LLMs), which power ChatGPT and Copilot, “can generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style.” This “undermine[s] and damage[s]” the Times’ relationship with readers, the outlet alleges, while also depriving it of “subscription, licensing, advertising, and affiliate revenue.”
The complaint also argues that these AI models “threaten high-quality journalism” by hurting the ability of news outlets to protect and monetize content. “Through Microsoft’s Bing Chat (recently rebranded as ‘Copilot’) and OpenAI’s ChatGPT, Defendants seek to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment,” the lawsuit states.
The full text of the lawsuit can be found here
This entire comment screams zero technical knowledge.
The LLM does not contain the training data. It contains nothing but math. It generates an answer by calculation, and in the end you get the answer that is statistically most likely to be what you want. Otherwise the fucking thing wouldn’t produce fake news and make shit up.
Sure, if you want it to write you a very specific thing and you know exactly what to ask, you might get a small text that is “copyrighted”, but that’s because you asked for it, not because it’s inside. It just gives you the answer you will most likely find helpful, statistically.
It’s like asking you to read a page very carefully and then asking you the next day to write down what was on the page, while giving you lots of hints. You didn’t actually copy from it in that case.
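To make the "nothing but math" point concrete, here is a rough Python sketch of a toy next-word predictor. It is an assumption-heavy illustration, not how ChatGPT or Copilot actually work: real LLMs use neural networks over tokens, while this one just counts which word follows which. The names train_bigram and generate and the tiny corpus are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, how often each other word follows it.

    Only these counts are kept; the original text itself is not stored.
    """
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, prompt, max_words=10):
    """Extend the prompt by repeatedly appending the most frequent follower."""
    out = prompt.split()
    for _ in range(max_words):
        followers = counts.get(out[-1])
        if not followers:
            break  # no statistics for this word, so stop
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)

print(dict(model["the"]))                   # follower counts for "the"
print(generate(model, "the", max_words=1))  # most likely continuation
```

The model here is only a table of counts, and the output is whatever continuation those counts make most likely. Note that even a toy like this can spit back fragments of its training text when the prompt steers it there, which is the "lots of hints" situation from the analogy.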
It’s all black magic to me, so if you have resources on this, that would be great. My initial thought is that it would surely have a data source to refer to? Your last example is someone referring to their memory of something and recreating it. By referring to that memory, isn’t that in essence a reference back to the original data that someone has remembered?
It’s like asking you to read a page very carefully and then asking you the next day to write down what was on the page, while giving you lots of hints. You didn’t actually copy from it in that case.
My guy, if you compellingly re-wrote Harry Potter from memory and charged people for access to your work, you can definitely expect J.K. Rowling to sue you.