Not sure if someone else has brought this up, but this is because these AI models are massively biased towards generating white people, so as a lazy “fix” they randomly add race tags to your prompts to get more racially diverse results.
Exactly. I wish people had a better understanding of what’s going on technically.
It’s not that the model itself has these biases. It’s that the instructions given to it are heavy-handed in trying to correct for a representation bias skewed the other way.
So the models are literally instructed with things like “if generating a person, add a modifier to evenly represent various backgrounds like Black, South Asian…”
Here you can see that modifier being reflected back when the prompt is shared before the image.
It’s like ethnicity Mad Libs the model is being instructed to fill out whenever it generates people.
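As a rough illustration (not the actual pipeline; the descriptor list, keyword check, and function name here are all invented), that rewriting layer amounts to something like this:

```python
import random

# Hypothetical descriptor pool; in the real systems this lives in the
# instructions/system prompt, not in code.
DESCRIPTORS = ["Black", "South Asian", "East Asian", "Hispanic",
               "Middle Eastern", "white"]

PERSON_WORDS = {"person", "man", "woman", "king", "queen", "soldier"}

def rewrite_prompt(prompt: str) -> str:
    """If the prompt appears to ask for a person, bolt on a randomly
    chosen ethnicity modifier. Note it is completely context-blind:
    it ignores the historical or geographic setting of the prompt."""
    if any(w.strip(".,!?") in PERSON_WORDS for w in prompt.lower().split()):
        return f"{prompt}, depicted as {random.choice(DESCRIPTORS)}"
    return prompt

print(rewrite_prompt("an English queen in 1700"))
# e.g. "an English queen in 1700, depicted as South Asian"
```

That context-blindness is exactly why it produces the absurd results in this thread: the modifier gets tacked on no matter whether the prompt says 1700s England or anything else.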
I mean, I don’t think it’s an easy thing to fix. How do you eliminate bias in the training data without throwing out a substantial percentage of it, which would significantly hinder performance?
Rather than eliminating some of the training data, you could add more training data to create an even balance.
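In principle that’s just oversampling: duplicate examples from underrepresented groups until the counts even out, instead of deleting the overrepresented ones. A toy sketch, assuming (big assumption) that every training example already carries a demographic label:

```python
import random
from collections import defaultdict

def oversample_to_balance(examples):
    """Duplicate-sample minority groups until every group is as large
    as the biggest one. `examples` = list of (image, group_label) pairs;
    the labels are a hypothetical annotation, not something scraped
    image data actually comes with."""
    by_group = defaultdict(list)
    for ex in examples:
        by_group[ex[1]].append(ex)
    target = max(len(group) for group in by_group.values())
    balanced = []
    for group in by_group.values():
        balanced.extend(group)
        balanced.extend(random.choices(group, k=target - len(group)))
    return balanced

data = [("img", "A")] * 90 + [("img", "B")] * 10
balanced = oversample_to_balance(data)
print(sum(1 for _, g in balanced if g == "B"))  # 90
```

The catch, and part of why this is hard in practice, is that scraped image data doesn’t come with those labels, and duplicating the same minority examples over and over invites overfitting on them.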
how about black nazis or female asian nazi soldiers?
Never ask a woman her age, a man his salary, or a white supremacist the race of his girlfriend
It’s horrifically bad, even if not compared against other LLMs. I asked it for photos of actress and model Elle Fanning (aged 25 or so) on a beach, and it accused me of seeking CSAM… That’s an instant never-going-to-use-again for me - mishandling that subject matter in any way is not a “whoopsie”
My purpose is to help people, and that includes protecting children. Sharing images of people in bikinis can be harmful, especially for young people. I hope you understand.
This just shows that AI sucks for getting accurate information. Even if it didn’t hallucinate black people, it would’ve been just as wrong, just with white-skinned queens. Now the lies just line up with the “current social freakout of conservatives”.
AI is like spicy autocomplete. People need to understand that AI is basically that Excel meme but with pictures.
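“Spicy autocomplete” is honestly a decent description of temperature sampling: the model scores every candidate next token, and the “spice” is a weighted random draw from those scores. A toy version with made-up scores:

```python
import math
import random

def sample_next_token(logits: dict, temperature: float = 1.0) -> str:
    """Softmax over model scores, then a weighted random draw.
    Low temperature ~ plain autocomplete (top token nearly always wins);
    high temperature = spicier, more random output."""
    m = max(logits.values())  # subtract max for numerical stability
    weights = [math.exp((s - m) / temperature) for s in logits.values()]
    return random.choices(list(logits), weights=weights)[0]

# Made-up scores for the word after "The queen was":
scores = {"crowned": 2.0, "English": 1.0, "a monkey": -1.0}
print(sample_next_token(scores, temperature=0.7))
```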
Excel 🤝 Incel
Incorrectly assuming it’s a date
👆 they probably meant this one
I do have to wonder if Excel would still have done that had the creator not misspelled February.
They didn’t? At least in the version I’ve seen, they typed “Fe” and Excel autofilled the “buary”. That’s the whole point of the meme.
This is fucking ridiculous. This AI is the worst of them all. I don’t mind it when they subtly try to insert some diversity where it makes sense, but this is just nonsense.
They are experimenting and tuning. Apparently without any correction there is significant racist bias. Basically the AI reflects the long-term racial bias in the training data. According to this BBC article it was an attempt to correct this bias but went a bit overboard.
PS: I find it hilarious. If anything it elevates the AI system to art, since it now provides an emotionally provoking mirror about white identity.
Significant racist bias is an understatement.
I asked a generator to make me a “queen monkey in a purple gown sitting on a throne” and got maybe two pictures of actual monkeys. I even tried rewording it several times to specify a real monkey, described the hair and everything.
The rest were all women of color.
Very disturbing. Pretty ladies, but very racist.
> Apparently without any correction there is significant racist bias.
This doesn’t make it any less ridiculous. This is a central pillar of this kind of AI tech, and they’re trying to shove a band-aid over the most obvious example of it. Clearly, that doesn’t work. It’s also only even attempting to fix one of the “problems” - they’re never going to be able to “band-aid” every single place where the AI exhibits this problem, so it’s going to leave thousands of others unfixed. Even if their band-aid works, it only continues to mask the shortcomings of this tech and makes it less obvious to people that it’s horrendously inaccurate at the other things it does.
> Basically the AI reflects the long-term racial bias in the training data. According to this BBC article it was an attempt to correct this bias but went a bit overboard.
Exactly. This is a core failing of LLM tech. It’s just going to repeat all the shit that was fed to it. You’re never going to fix that. You can attempt to steer it in different directions, but the reason this tech was used is that it is otherwise impossible for us to trudge through all the info that was fed to it. This was the only way to get it to “understand” everything. But all of its understandings are going to have these biases, and it’s going to be just as impossible to run through and fix all of those. It’s like you didn’t have enough metal to build the Titanic, so you built it out of Swiss cheese and are trying to duct-tape one hole closed so it doesn’t sink. It’s just never going to work.
This being pushed as some artificial INTELLIGENCE is the problem here. This shit doesn’t understand what it’s doing, it’s just regurgitating the things it’s consumed. It’s going to be exactly as flawed as whatever was put into it, and you can’t change that. The internet media it was trained on is racist, biased, full of undeniably false information, and massively swayed by propaganda on all sides of the fence. You can’t expect LLMs to do anything different when trained on that data. They’re going to have all the same problems. Asking these things to give you any information is like asking the average internet user what the answer is. And the average internet user is not very intelligent.
These are just amped-up chat bots with data sourced from random bits of the internet. Calling them artificial INTELLIGENCE misleads people into thinking these bots are smart or have some sort of understanding of what they’re doing. They don’t. They’re just fucking internet parrots, and they don’t have the architecture to be “fixed” of these problems. Trying to patch these problems out is a fool’s errand and only masks their underlying failings.
Would it be possible to create a kind of “formula” to express the abstract relationship between ethnic makeup, location, year, and field? Like, convert a table of population, country, and ethnicity mix per year and then train the model on that. It’s clear that it doesn’t understand the meaning or abstract concept, but it can associate and extrapolate things. So it could “interpret” what the image description says while training and then use the prompt better. So if you prompted “english queen 1700” it would output a white queen, and if you input year 2087 it would be ever so slightly less pasty.
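Something like that would be straightforward at the prompt-rewriting layer, at least as a crude heuristic. A sketch with an invented demographics table keyed by (country, year); all numbers are made up for illustration:

```python
import random

# Invented numbers purely for illustration; a real table would come
# from census / historical-demographics data.
DEMOGRAPHICS = {
    ("england", 1700): {"white": 0.99, "Black": 0.005, "South Asian": 0.005},
    ("england", 2087): {"white": 0.55, "Black": 0.15, "South Asian": 0.20,
                        "East Asian": 0.10},
}

def pick_descriptor(country: str, year: int) -> str:
    """Sample an ethnicity descriptor weighted by the closest
    (country, year) entry, instead of uniformly at random."""
    years = [y for (c, y) in DEMOGRAPHICS if c == country]
    nearest = min(years, key=lambda y: abs(y - year))
    dist = DEMOGRAPHICS[(country, nearest)]
    return random.choices(list(dist), weights=list(dist.values()))[0]

print(pick_descriptor("england", 1700))  # almost always "white"
print(pick_descriptor("england", 2087))  # noticeably less pasty
```

Whether the image model itself could learn that mapping from captioned training data is a different question, but as a patch on the modifier-injection approach it at least beats sampling uniformly.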
None of this has been pushed, by any researcher, by any company, even by any open source group, as “intelligence”. In fact, it was unanimously disliked as a term by everyone working with the models and transformers, but the media circus, combined with techbro laymen’s hard-on for hype, has won. Since then everyone has given up trying to be semantically correct on this front.
> For example, a prompt seeking images of America’s founding fathers turned up women and people of colour.
“A bit” overboard yeah
We all expected the AIs to launch nukes, and they simply held up a mirror.