GRRM is worried AI will finish writing his books before him
Would you rather it was finished by 100 duck-sized George RR Martins or a George RR Martin-sized duck?
Get those 100 duck-sized GRRMs a bunch of mini keyboards and they’ll get it done 100 times faster
And then everyone would bitch because it wasn’t good, like what happened with the last seasons of GoT.
Another moment in A Dream of Spring involved Bran receiving a vision that The Wall was not just a physical barrier, but a mystical shield holding back the Night King’s power. “This twist fits well within the universe and raises tension for the remainder of the story,” Swayne remarks.
That’s just a popular fan theory that has been discussed countless times on various forums.
I guess we can conclude that ChatGPT has been reading a lot of reddit.
Actually getting a good ending to ASOIAF after GRRM dies is gonna be one of the big turning points that transforms everyone’s opinion on AI.
It’s gonna be like fan edits for movies. People will debate which is the better version of the story. The only person hurt by this is George, who will be dead and was never going to finish the books anyways.
The authors added that OpenAI’s LLMs could result in derivative work “that is based on, mimics, summarizes, or paraphrases” their books, which could harm their market.
Ok, so why not wait until those hypothetical violations occur and then sue?
Because suing first is meant to address the potential harm of what OpenAI is doing right now, before it happens. Kind of like how safety regulations are intended to prevent future problems based on what has happened previously, but extended to similar potential dangers instead of waiting for each exact scenario to happen.
But if OpenAI cannot legally be inspired by your work, the implication is humans can’t either.
That’s not how copyright works. Transformative work is transformative.
The way I’ve heard it described: if I check out a home repair book and use that knowledge to do some handyman work on the side, do I owe the publisher a cut of my profits?
How is that the implication?
Inspiration is something we do through conscious experience. Just because some statistical analysis of a word cloud can produce sentences that trick a casual observer into thinking a person wrote them doesn’t make it a creative process.
In fact, I can prove to you that (so-called) AI can never be creative.
To get an AI to do anything, we have to establish a goal to measure against. You have to quantify it.
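A minimal sketch of what “quantify it” means in practice (a toy example, not any real system’s training code; the model, data, and loss are all made up for illustration): before a model can learn anything, the whole notion of “good output” has to be collapsed into a single number it can minimize.

```python
import torch
import torch.nn as nn

# Toy training loop: whatever "success" is supposed to mean, it must first
# be reduced to one scalar (the loss) before the model can optimize for it.
model = nn.Linear(10, 1)                       # stand-in for any trainable model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()                         # the "definition of good", as a number

inputs = torch.randn(32, 10)                   # made-up training data
targets = torch.randn(32, 1)

for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)     # everything collapses into this scalar
    loss.backward()                            # gradients exist only w.r.t. that metric
    optimizer.step()
```

The model never gets to reject `loss_fn`; it can only get better at it.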
If you tell a human being “this is what it means to be creative; we have an objective measure of it”, do you know what they tend to do? They say “fuck your definition” and try to make something that breaks the rules in an interesting way. That’s the entire history of art.
You can even see that playing out with respect to AI. Artists going “You say AI art can’t be art, so I’m gonna enter AI pieces and see if you can even tell.”
That’s a creative act. But it’s not creative because of what the AI is doing. Much like Duchamp’s urinal wasn’t a creative object, but the act of signing it R Mutt and submitting it to a show was.
The kinds of AIs we design right now will never have a transformative R Mutt moment, because they are fundamentally bounded by their training. They would have to be trained to use novel input to dismantle and question their training (and have that change stick around), but even that training would then become another method of imitation that they could not escape. They can’t question quantification itself, because they are just quantitative processes — nothing more than word calculators.
Generative AI training is not the same thing as human inspiration. And transformative work has thus far only been performed by a human. Not by a machine used by a human.
Clearly using a machine that simply duplicates a work to resell runs afoul of copyright.
What about using a machine that slightly rewords that work? Is that still allowed? Or a machine that does a fairly significant rewording? What if it sort of does a mashup of two works? Or a mashup of a dozen? Or of a thousand?
Under what conditions does it become ok to feed a machine with someone’s art (without their permission) and sell the resulting output?
The difference is that you’re trying to sue someone based on what could happen. That’s like suing some random author because they read your book and could potentially write a story that would be a copy of it.
LLMs are trained on writings in the language and learn how to structure sentences from their training data. Do AI models plagiarize any more than someone using their understanding of the English language is plagiarizing when they construct a brand new sentence? After all, we learn how to write by reading the language and learning its rules. Is the training data we read when we were kids being infringed whenever we write about similar topics?
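To make that concrete, here’s a toy bigram “language model” (nowhere near a real LLM, and the corpus is made up): it counts which word tends to follow which, then generates new sequences from those statistics. It stores patterns of the language, not the text itself.

```python
import random
from collections import Counter, defaultdict

# Toy bigram model: learn word-transition counts from a tiny made-up corpus,
# then generate a new sequence from those statistics.
corpus = "the night is dark and full of terrors the night is long".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1                    # count which word follows which

word = "the"
out = [word]
for _ in range(8):
    choices = follows.get(word)
    if not choices:                            # dead end: no observed successor
        break
    word = random.choices(list(choices), weights=list(choices.values()))[0]
    out.append(word)
print(" ".join(out))                           # a new-ish sequence built from learned stats
```

Real LLMs are vastly bigger and subtler, but the same question applies: is learning those statistics from a text “copying” it any more than a person internalizing grammar is?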
When someone uses AI to plagiarize, sue them into eternity for all I care, but no one seems concerned with the implications of trying to sue someone/something because they trained an intelligence by letting it read publicly available written works. Reading and learning isn’t allowed because you could maybe one day possibly use that information to break copyright law.
I see this more like suing a musician for using a sample of your recording, or a certain number of notes or lyrics from your song, without your consent. The musician created a new work, but it was based on your previous songs. I’m sure if a publisher asked ChatGPT to produce a GRRM-like novel, it would create a plagiarism-lite mash-up of his works that were used as writing samples, using pieces of his plots and characters, maybe even quoting directly. Sampling GRRM’s writing, in other words.
Suing for copyright infringement always includes justification covering both current and potential future losses. You don’t get paid for the potential losses, but they are still grounds for making the infringement stop right now.
Do AI models plagiarize any more than someone using their understanding of the English language is plagiarizing when they construct a brand new sentence?
Yes
Safety regulations are created by regulatory agencies empowered by Congress, not private parties suing each other over hypotheticals.
It was a comparison about preventing future issues, not a literally equivalent legal situation.
Because that is far harder to prove than showing OpenAI used his IP without permission.
In my opinion, it should not be allowed to train a generative model on data without the permission of the rights holder. So at the very least, OpenAI should publish (references to) the training data they have used so far, and probably restrict the dataset to public-domain and opt-in works for future models.
I don’t see why they (authors/copyright holders) have any right to prevent use of their product beyond purchasing. If I legally own a copy of Game of Thrones, I should be able to do whatever the crap I want with it.
And basically, I can. I can quote parts of it, I can give it to a friend to read, I can rip out a page and tape it to the wall, I can teach my kid how to read with it.
Why should I not be allowed to train my AI with it? Why do you think it’s unethical?
And basically, I can. I can quote parts of it, I can give it to a friend to read, I can rip out a page and tape it to the wall, I can teach my kid how to read with it.
These are things you’re allowed to do with your copy of the book. But you are not allowed to, for example, create a copy of it and give that to a friend, or create a play or a movie out of it. You don’t own the story; you own a copy of it on a specific medium.
As to why it’s unethical, see my comment here.
Ownership is never absolute. Just like with music: you are not allowed to use it commercially, e.g. in your restaurant, club, beauty salon, etc., without paying extra. You are also not allowed to do the same with books; for example, you shouldn’t share scans online, although it’s “your” book.
However, it is not clear how AI infringes on the rights of authors in this case, because a human may read a book and legally produce a similar book in the same style.
Assuming the books used for GPT training were indeed purchased, not pirated, and since “AI training” was not prohibited at the time of purchase, the engineers had every right to use them. Maybe authors could prohibit “AI training” in the future, but for books purchased before they do, “AI training” is fair use.
Okay, the problem is there are only about three companies with either enough data or enough money to buy it. Any open-source or small-time AI model is completely dead in the water. Since our economy is quickly moving towards being AI-driven, it would basically guarantee our economy is completely owned by a handful of companies like Getty Images.
Any artist with less weight than GRRM and Taylor Swift is still screwed; they might get a peanut or two at most.
I’d rather get an explosion of culture, even if it means GRRM doesn’t get a last fat paycheck and Hollywood loses control of its monopoly.
I get it. I download movies without paying for them too. It’s super convenient, and much cheaper than doing the right thing.
But I don’t pretend it’s ethical. And I certainly don’t charge other people money to benefit from it.
Either there are plenty of people who are fine with their work being used for AI purposes (especially in an open-source model), or they don’t agree to it, in which case it would be unethical to do so.
Just because something is practical, doesn’t mean it’s right.
We could get Elon Musk to develop a corpus and train all AI on that, instead of training AI on a corpus scraped from websites.
I mean, this isn’t miles away from what the writers’ strike is about. Certainly I think the technology is great, but after the last round of tech companies turning out to be fuckwits (Facebook, Google, etc.) it’s only natural that people are going to want to make sure this stuff is properly regulated and run fairly (not at the expense of human creatives).
As it stands now, I actually think it is miles away.
Studios were raking in huge profits from digital residuals that weren’t being passed to creatives, but AI models aren’t currently paying copyright holders anything. If they suddenly did start paying publishers for use, it would almost certainly exclude payment to the actual creatives.
I’d also point out that LLMs aren’t like digital distribution models, because LLMs aren’t distributing copyrighted works; at best you can say they’re distributing a very lossy (to use someone else’s term) compressed alternative that has to be pieced back together manually if you really wanted to extract it.
No argument that AI should be properly regulated, but I don’t think copyright is the right framework to be doing it.
If the models trained on pirated works were available in a non-profit sort of setup, with any commercial application banned, I think that would be fine.
Business owners salivating over the idea that they can just pocket the money writers and artists would make is not exactly a good use of tech.
Copyright law in general is out of date and needs updating. It’s not just AI that’s the problem; that’s just the biggest new thing. But content creators of traditional media have been railing against what they perceive as copyright violation for ages.
Look at Nintendo and Let’s Plays.
The law is the problem here. Not AI.
Copyright law has been such a disaster for so long, while clearly being wielded like a blunt weapon by corporations. I can see the existential threat that generative AI can pose to creators if it becomes good enough. And I also am aware that my dream of asking an AI to make a buddy cop adventure where Batman and Deadpool accidentally bust into the Disney universe, or remake the final season of Game of Thrones, is never gonna be allowed, but there’s honestly a huge amount of potential for people to get the entertainment they want.
At any rate, it seems likely that they’re going to try and neuter generative AI with restrictions, despite it really not being the issue at hand.