cross-posted from: https://lemmy.ml/post/14869314

“I want to live forever in AI”

1 point

I’m sorry, but now this is getting completely wrong…

Read the first paragraph of the Wikipedia article on machine learning, or the introduction of any textbook on the subject. The “generalization” it talks about includes exactly that model-building capability, and the details come later in the article. It specifically says “to unseen data”, and “learning” is right there too. I don’t think the Wikipedia article explains it particularly well, but at least the first sentences lay out what the field is about.
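
To make “generalization to unseen data” concrete, here is a minimal sketch (my own toy example with scikit-learn, not something from the article): the model is fit on one set of examples and then scored on a held-out set it has never seen.

```python
# Toy illustration of "generalization to unseen data":
# fit a model on training samples, then score it on held-out samples.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# The test split is never shown to the model during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)  # "learning" from examples

# Accuracy on unseen data is the measure of generalization,
# not accuracy on the memorized training set.
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy: ", model.score(X_test, y_test))
```

The whole point of the field is that the test accuracy stays high on data the model never saw, which is exactly the “generalization” the definition refers to.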

And what do you think language and words are for? To transport information. There is semantics: words have meanings. They name things, abstract and concrete concepts. The word “hungry” isn’t just a funny accumulation of lines and arcs that statistically get followed by other specific lines and arcs… there is more to it: a meaning.

And this is what makes language useful. And the generalization and prediction capabilities are what make ML useful.

How do you learn as a human, if not from words? There are a few other possibilities, but an efficient way is to use language. You sit in school or at university and someone at the front of the room speaks a lot of words… You read books, and they also contain words. And language is super useful. A lion mother also teaches her cubs how to hunt, without words, but humans have language, and it’s a real step up in what we can pass down to following generations. We record knowledge in books, can talk about abstract concepts, feelings, ethics, theoretical ideas. We can write down how gravity and physics and nature work, just with words. That’s all possible with language.

I can look up whether there’s a good article explaining how learning concepts works and why that’s the fundamental thing that makes machine learning a scientific field… Ultimately I’m not a science teacher, and my literature is all in German and I returned it to the library a long time ago. Maybe I can find something.

Are you by any chance familiar with the concept of embeddings, or with vector databases? I think they show that it’s not just letters and words inside the models. The vectors/embeddings that the input gets converted into correspond to concepts: they point at the concept of “cat” or “presidential speech”. And you can query such a database: point at “presidential speech” and find a representation of it in that region, store the speech under that key, and later retrieve it by asking what Obama said at his inauguration. That’s oversimplified, but maybe it illustrates that it isn’t just letters and words that get stored in the models, but the actual meanings. Words get converted into a (multidimensional) vector space and the model operates there. These word representations are called “embeddings”, and transformer models, the current architecture for large language models, use them.
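
If it helps, here is a rough sketch of that idea in code (using the sentence-transformers library and plain cosine similarity as a stand-in for a real vector database; the model name and the example texts are just ones I picked for illustration):

```python
# Sketch: texts become vectors ("embeddings"), and a query is answered
# by finding the stored vector pointing in the most similar direction.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # example embedding model

documents = [
    "The cat sat on the warm windowsill.",
    "In his inaugural address, the president spoke about hope and unity.",
    "Quarterly earnings rose by three percent.",
]
doc_vectors = model.encode(documents)  # one vector per document

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = "What did Obama say at his inauguration?"
q_vec = model.encode([query])[0]

# The best match is the presidential speech, even though the query
# shares almost no literal words with it.
scores = [cosine(q_vec, d) for d in doc_vectors]
print(documents[int(np.argmax(scores))])
```

The query lands on the right document despite having almost no words in common with it, because the vectors encode meaning rather than spelling.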

Edit: Here you are: https://arxiv.org/abs/2304.00612

2 points

The “learning” in an LLM is statistical information about sequences of words. There’s no learning of concepts or generalization.

And what do you think language and words are for? To transport information.

Yes, and humans used words for that and wrote it all down. Then an LLM came along, was force-fed all those words, and with a big enough data set it was able to imitate them. It’s like a parrot imitating the sound of someone’s voice: it can do it convincingly, but it has no concept of the content it’s using.
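
To make that concrete, here is a deliberately tiny sketch of what “statistical information on sequences of words” looks like (a toy bigram model over a made-up corpus; real LLMs are enormously more sophisticated, but the principle of next-word statistics is the same):

```python
# Toy bigram model: count which word follows which, then sample from
# those counts. It produces plausible-looking sequences without any
# notion of what the words mean.
import random
from collections import defaultdict

corpus = (
    "don't touch the stove it's hot . "
    "the stove is hot . don't touch it ."
).split()

follows = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current].append(nxt)

word = "don't"
output = [word]
for _ in range(8):
    word = random.choice(follows.get(word, corpus))
    output.append(word)

print(" ".join(output))  # e.g. "don't touch the stove is hot . don't touch"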

How do you learn as a human, if not from words?

For a human, the words are merely the context for the learning. If someone says “Don’t touch the stove, it’s hot”, the important context is the stove, the pain of touching it, and so on. If you feed an LLM 1000 scenarios involving the phrase “Don’t touch the stove, it’s hot”, it may be able to create unique dialogues containing those words, but it doesn’t actually understand pain or heat.

We record knowledge in books, can talk about abstract concepts

Yes, and those books are only useful to someone with a lifetime of experience that lets them understand the concepts in them. An LLM has no such context; it can merely generate plausible books.

Think of it this way. Say there’s a culture where, instead of the written word, people wrote down history by weaving fabrics. When there was a death they’d weave a certain pattern, when there was a war they’d use another pattern, a new birth would be shown with yet another pattern, a good harvest with yet another one, and so on.

Thousands of rugs from that culture are shipped to some guy in Europe, and he spends years studying them. He sees that pattern X often follows pattern Y, and that pattern Z only ever seems to appear after patterns R, S and T. After a while he makes a fabric of his own, and it’s shipped back to the people who originally made the weaves. They read a story of a great battle followed by many deaths, but surprisingly followed by new births and years of great harvests. They figure this stranger must understand how their system of recording events works. In reality, it was just an imitation of the art he had seen, with no understanding of the meaning at all.

That’s what’s happening with LLMs, but some people are dumb enough to believe there’s intention hidden in there.

1 point

people wrote down history by weaving fabrics […]

Hmm. I think in philosophy that thought experiment is known as the Chinese room.

2 points

Yeah, that’s basically the idea I was expressing.

Except that the original idea is about “understanding Chinese”, which is a bit vague. You could argue that the best translation programs today “understand Chinese”, at least enough to translate between Chinese and English. That is, they understand the rules of Chinese when it comes to subjects, verbs, objects, adverbs, adjectives, etc.

The question is now whether they understand the concepts they’re translating.

Like, imagine the Chinese government wanted to modify the program so that it refused to talk about subjects the government considers off-limits. I don’t think any current LLM could do that, because it requires understanding concepts. Sure, you could ban keywords, but as attempts at Chinese censorship have shown over the years, people work around word bans all the time.
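
For example, a keyword ban is trivial to write and just as trivial to route around (a made-up sketch, not how any real filter works):

```python
# Naive keyword censorship: block messages containing banned strings.
# It matches surface text, not the underlying concept, so a paraphrase
# or a deliberate misspelling sails straight through.
BANNED = {"forbidden topic"}

def is_blocked(message: str) -> bool:
    text = message.lower()
    return any(term in text for term in BANNED)

print(is_blocked("Let's discuss the forbidden topic."))       # True
print(is_blocked("Let's discuss the f0rbidden t0pic."))       # False: same idea, different letters
print(is_blocked("Let's discuss that thing we can't name."))  # False: same concept, no keyword
```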

That doesn’t mean that some future system won’t be able to understand concepts. It may have an LLM grafted onto it as a way to communicate with people. But the LLM isn’t the part of the system that thinks about concepts; it’s the part that generates plausible language. The concept-thinking part would be the one doing some prompt engineering for the LLM so that the generated text matched the ideas it was trying to express.
