Data poisoning: how artists are sabotaging AI to take revenge on image generators
As AI developers indiscriminately suck up online content to train their models, artists are seeking ways to fight back.
This system runs on the assumption that A) massive generalized scraping is still required, B) you maintain the metadata of the original image, and C) no transformation is applied to the poisoned picture prior to training (Stable Diffusion trains at 512×512). Nowhere in the linked paper do they say they conditioned the poisoned data to conform to the training set. This appears to be a case of fighting the last war.
Take the image, apply antialiasing, and resize it.
Oh, look at that: defeated by the completely normal process of preparing the image for training.
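The intuition above can be sketched with a toy example (my own illustration, not the paper's actual pipeline): a poisoning perturbation that lives in high-frequency pixel detail can be largely averaged away by an antialiased downscale. Here a 1-D "image" carries an alternating +d/−d perturbation, and a 2:1 box filter (the simplest antialiasing resize) averages adjacent pixels, cancelling it out:

```python
# Toy sketch: high-frequency perturbations vs. an antialiasing resize.
base = [100.0] * 8                      # clean signal
d = 5.0
# Alternate +d / -d on adjacent pixels: invisible at a glance,
# but a deliberate high-frequency perturbation.
poisoned = [v + (d if i % 2 == 0 else -d) for i, v in enumerate(base)]

def downscale_2to1(pixels):
    """Box-filter resize: each output pixel averages two input pixels."""
    return [(pixels[i] + pixels[i + 1]) / 2 for i in range(0, len(pixels), 2)]

print(downscale_2to1(poisoned))  # -> [100.0, 100.0, 100.0, 100.0]
```

Real poisoning schemes use subtler perturbations than a pixel-level checkerboard, so this doesn't prove they are all destroyed by resizing; it only shows why perturbations tuned for one resolution may not survive standard preprocessing at another.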
Unfortunately for them, there are a lot of jobs dedicated to cleaning data, so I'm not sure this would even be effective. Plus there's an overwhelming amount of data that isn't "poisoned", so even if the poisoned images were never caught, they would just get drowned out.
Imagine if writers did the same thing by writing gibberish.
At some point, it becomes pretty easy to devalue that content and create other systems to filter it.
I mean, isn't that eventually going to happen anyway? Isn't AI eventually going to be trained on AI-generated datasets, so that small issues start to propagate exponentially?
I just assume we have a clean pre-AI dataset and a messy, gross post-AI dataset… if it keeps learning from the latter dataset, it will just get worse and worse, no?
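This degradation effect can be sketched with a toy simulation (my own illustration, not from the thread): each "generation" of a model is fit only to samples produced by the previous generation, mimicking AI trained on AI output. Small-sample fitting keeps losing tail information, the losses compound, and the fitted spread (a crude stand-in for data diversity) collapses:

```python
import random
import statistics

# Toy sketch of generational degradation: generation 0 is the "clean",
# pre-AI data; every later generation is fit only to a small sample
# drawn from the previous generation's fitted distribution.
random.seed(42)
mu, sigma = 0.0, 1.0
for generation in range(300):
    samples = [random.gauss(mu, sigma) for _ in range(10)]  # tiny training set
    mu = statistics.fmean(samples)
    sigma = statistics.stdev(samples)

# The spread drifts far below the original 1.0: diversity collapses
# when models keep learning from their own output.
print(f"spread after 300 generations: {sigma:.6f}")
```

This is a deliberately extreme setup (tiny samples, a simple Gaussian), but it captures the mechanism the comment is pointing at: errors and losses in each generation feed into the next instead of averaging out.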
Nightshade and Glaze never worked. It's a scam lol