African Americans have a weak bias against writing in African American English -> Colleges have weak bias against accepting African Americans as graduate students -> Academic text have strong bias for text written by graduate students -> LLM training data has bias for academic texts -> LLMs have a strong bias for writing like training data.
The error occurs upstream a bit, don’t point at the coders.
Writing in AAVE is silly, just like someone from the Deep South including southern drawl in their writing would be, or someone from Boston spelling “car keys” as “kha kees”
So
African Americans have a weak bias against writing in African American English -> Colleges have weak bias against accepting African Americans as graduate students
Is a bit of a jump. Someone writing in AAVE probably wouldn’t get accepted to college, because written word is supposed to transcend dialects and follow a set of rules to be universally understandable.
Most LLMs support this. You just have to enable Jive mode.
Obligatory “I speak Jive” link:
Is this the new term for ebonics and is ebonics offensive now or inappropriate?
What about YTVE though?
So for those that didn’t read the article, it basically explains how LLMs have a negative connotation about AAE. When asked to associate words with AAE written phrases, it used words like “aggressive”. When given a normal English phrase and the same phrase but in AAE and then asked what jobs would suit this person, the LLM gave low income jobs for the AAE statement with broader options for the normal English one.
It’s a serious problem because people that naturally write in AAE are most likely getting worse results. It stems mostly from old rascist newspaper articles and similar things.