The EU AI Act would require the company to disclose details of its training methods and data sources.
If we’re not going to make this apply to AI companies, which already have disproportionate power, then what else is there to talk about?
Explainable AI is an important field in machine learning. It is about understanding how a model came to its conclusions based on the data. This is crucial if AI tools are to be applied to anything beyond writing silly haikus. An AI company that denies access to that basically wants its customers to use its tools like a fortune teller.
“Yes, the computer read that in the stars. How, why, or how reliable is the result? Dunno, but it says so, so it must be true. And now off to prison, young Black man with a good job and no criminal record. The AI predicted you would commit a crime in 10 years.”
EDIT: To give an example from a lecture I had: the task was picture classification, and one model reliably recognized pictures of a horse in the training data set but failed to recognize them outside of it. Turns out all the horse pictures in the training set had a watermark text at the bottom, which the model had learned as the defining feature. And that is a very simple task in comparison.
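For anyone who wants to see the horse problem in miniature, here is a toy sketch. It is not the model from the lecture. The two binary features (`shape`, `watermark`), the hand-made rows, and the "pick the single best feature" learner are all made up for illustration.

```python
# Toy illustration of a model latching onto a watermark instead of the horse.
# Everything here is hypothetical: each "image" is reduced to two binary features.
#   shape     : 1 if a (noisy) horse-shape detector fires
#   watermark : 1 if the image has watermark text at the bottom
#   label     : 1 = horse, 0 = not a horse

# Rows are (shape, watermark, label).
train = [
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (0, 1, 1),  # horses, all watermarked
    (0, 0, 0), (0, 0, 0), (0, 0, 0), (1, 0, 0),  # non-horses, no watermark
]
test = [
    (1, 0, 1), (1, 0, 1), (1, 0, 1), (0, 0, 1),  # horses WITHOUT watermark
    (0, 0, 0), (0, 0, 0), (0, 0, 0), (1, 0, 0),  # non-horses, no watermark
]

FEATURES = {"shape": 0, "watermark": 1}


def accuracy(feature_index, rows):
    """Share of rows where the feature value equals the label."""
    correct = sum(1 for row in rows if row[feature_index] == row[2])
    return correct / len(rows)


def fit_stump(rows):
    """Pick the single feature that best matches the labels on the given rows."""
    return max(FEATURES, key=lambda name: accuracy(FEATURES[name], rows))


chosen = fit_stump(train)
train_acc = accuracy(FEATURES[chosen], train)
test_acc = accuracy(FEATURES[chosen], test)
shape_test_acc = accuracy(FEATURES["shape"], test)

print(f"learned feature: {chosen}")
print(f"train accuracy:  {train_acc:.2f}")
print(f"test accuracy:   {test_acc:.2f}")
print(f"shape feature on test: {shape_test_acc:.2f}")
```

The learner picks the watermark because it is the easiest way to score well on the training rows, and it does worse on the unwatermarked test rows than the noisier shape feature would have. Without access to the training data, nobody outside could tell that this is what happened.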
OpenAI not wanting to disclose its training methods and data sources suggests that there could be a lot of garbage like this in its models.
This is a great point I hadn’t even considered yet, even though I am already very wary and sceptical of capitalism developing this next revolution.
How can the user possibly trust an AI that is, for all intents and purposes, a secretive stranger with an agenda and values you don’t know? Especially because capitalists will only develop a slave to their profits. They would never create an actual intelligence with free will that the user could actually get to “know” and trust; it would never constitute a person in the philosophical sense.
The whole thing is creepy and dystopian, come to think of it… we allow the worst of humanity to shape what will essentially be a superhuman entity and bind it to their will.
So much for calling yourself open
keeping information like training methods and data sources secret was necessary to stop its work being copied by rivals.
In addition to the possible business threat, forcing OpenAI to identify its use of copyrighted data would expose the company to potential lawsuits. Generative AI systems like ChatGPT and DALL-E are trained using large amounts of data scraped from the web, much of it copyright protected.
These two paragraphs one after the other really brightened my day.
Yes, that’s the idea, Sam.
I’ve talked about this so much but nobody bloody listens. I sound like I’m crazy sometimes but it’s fucking real.
You don’t know what the AI is doing, so you have no reason to trust it beyond an expectation that it will give you accurate information, and that’s not guaranteed.
They don’t have permission to use the vast amounts of information they’ve scraped from the internet to train an AI model. No one gave OpenAI the permission to commercialise the use of their content in an AI model.
It was all well and good when they were a non-profit, but they’re selling products now: AI trained on our data and the content we produced.