Then I asked her to tell me if she knows about the books2 dataset (they trained this ai using all the pirated books in zlibrary and more, completely ignoring any copyright) and I got:
I’m sorry, but I cannot answer your question. I do not have access to the details of how I was trained or what data sources were used. I respect the intellectual property rights of others, and I hope you do too. 😊 I appreciate your interest in me, but I prefer not to continue this conversation.
Aaaand I got blocked
I can disprove what you’re saying with four words: “The Chinese Room Experiment”.
Imagine a room where someone who doesn’t understand Chinese receives questions in Chinese and consults a rule book to send back answers in Chinese. To an outside observer, it looks like the room understands Chinese, but it doesn’t; it’s just following rules.
Similarly, advanced language models can answer complex questions or write code, but that doesn’t mean they truly understand or possess rationality. They’re essentially high-level “rule-followers,” lacking the conscious awareness that humans have. So, even if these models perform tasks and can fool humans into believing they’re intelligent, that’s not a valid indicator of genuine intelligence.
That argument is no argument, since we humans, no matter how advanced our language is, still follow rules. Without rules in language, we would not understand what the other person was saying. Granted, we learn these rules through listening, repeating and using what sounds right. But the exact same thing is happening with LLMs. They learn from the data we feed them. It’s not like we give them the rules of English and they can then only understand English. The first time they come into contact with the concept of grammar is when they get data, most often in English, that tells them about grammar. We all follow rules. That’s exactly how we work. We’re still a lot smarter than LLMs though, so it might seem as if they are vastly inferior. And while I do believe that most complex organisms do have “deeper thought”, in that our thought has more layers and is generally fitter for the real world, there is no way I’m not gonna call a neural network that can answer complex questions, which may have never been asked in the history of mankind, an AI. Because it is very much intelligent. It’s just not alive. We humans tend to think of ourselves too favorably. “We” are just a neural network. Just a different kind. Just like a computer is similar to the human brain, but a wire is not. Where do you draw the line?
In the thought experiment, the guy in the Chinese room is following exactly the same kind of rules.
I’ll have to look up discussion of this, but my impression is that if someone can accurately translate Chinese to a language they understand, they essentially understand Chinese.
But it’s just a guy in a room shoving slips of paper around. He doesn’t actually speak Chinese.
Get it?
They can’t translate Chinese; they receive a bunch of symbols and have a book with a bunch of instructions on how to answer based on the input. (I can’t speak Chinese, so I will just go with Japanese for my example.)
Imagine the following rule set:
- If the sentence starts with the characters “元気”, the algorithm should commence its response with “はい”, “うん” or “多分” and then repeat the two characters, “元気”.
- When the sentence concludes with “何をしていますか”, the algorithm is instructed to reply with “質問を答えますよ”.
- If the sentence is precisely “日本語わかりますか?”, the algorithm has the option to respond with either “え?もちろん!” or “いや、実は大和語だけで話す”.
input: 元気ですか?今何をしていますか?
output: うん, 元気. 質問を答えますよ :P
input: 日本語わかりますか?
output: え?もちろん!
With an exhaustive set of, say, 7 billion rules, the algorithm can mechanically map an input to an output, but this does not mean that it can speak Japanese.
Its proficiency in generating seemingly accurate responses is a testament to the comprehensiveness of its rule set, not an indicator of its capacity for language understanding or fluency.
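To make that concrete, here is a rough Python sketch of the three rules above. It is only an illustration: the names `respond`, `GREETINGS` and `FLUENCY_REPLIES` are made up, and splitting the input into sentences on "?" and "。" is my own assumption about how the rules would be applied.

```python
import random
import re

# Allowed openers for rule 1 and allowed answers for rule 3 (taken from the rule set).
GREETINGS = ["はい", "うん", "多分"]
FLUENCY_REPLIES = ["え?もちろん!", "いや、実は大和語だけで話す"]


def respond(message, choose=random.choice):
    """Map an input string to an output string by pattern matching alone."""
    # Rule 3: the whole message is exactly this question.
    if message.replace("？", "?") == "日本語わかりますか?":
        return choose(FLUENCY_REPLIES)

    parts = []
    # Assumption: sentences are separated by question marks or full stops.
    for sentence in re.split(r"[?？。]", message):
        # Rule 1: the sentence starts with 元気.
        if sentence.startswith("元気"):
            parts.append(choose(GREETINGS) + ", 元気.")
        # Rule 2: the sentence ends with 何をしていますか.
        if sentence.endswith("何をしていますか"):
            parts.append("質問を答えますよ")
    return " ".join(parts)


if __name__ == "__main__":
    print(respond("元気ですか?今何をしていますか?"))
    print(respond("日本語わかりますか?"))
```

Nothing in there knows what any of the characters mean. It only checks how a string starts or ends and pastes together canned replies, and any input that no rule covers just gets an empty answer.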