We all know by now that ChatGPT is full of incorrect data, but I trusted it would not go wrong when I asked for a list of sci-fi book recommendations (mostly short story anthologies in Spanish) including title, publisher, print year and of course ISBN.
Some of the books do exist, but the majority are nowhere to be found. I picked the one that caught my interest the most and contacted the publisher directly after I could not find it on their website or anywhere else.
This is what they replied (Google Translate):
ChatGPT got it wrong.
We don’t have any books with that title.
In the ISBN it gave you, the last digit is incorrect. The correct one (9788477028383) corresponds to “The Holy Fountain” by Henry James.
Nor have we published any science fiction anthologies in the last 25 years.
A quick search on the “old site” shows that others have experienced the same with ChatGPT and ISBN searches… For some reason I thought it would not go wrong in this case, but it did.
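For anyone who wants to sanity-check an ISBN before emailing a publisher: the last digit of an ISBN-13 is a check digit computed from the first twelve, so a wrong final digit can be caught offline. Here is a minimal sketch of that check. The function names are my own, and the second test number is simply the publisher's ISBN with the last digit changed by me for illustration; it is not the number ChatGPT actually produced.

```python
def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit from the first 12 digits."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Digits are weighted 1, 3, 1, 3, ... from left to right.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Return True if the string is 13 digits with a matching check digit."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


# The ISBN quoted in the publisher's reply.
print(is_valid_isbn13("9788477028383"))  # True
# Hypothetical variant with only the last digit altered.
print(is_valid_isbn13("9788477028384"))  # False
```

Note that a valid check digit only says the number is well formed. It says nothing about which book it belongs to, so a catalogue lookup is still needed.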
“I used a hammer to screw screws and it didn’t work.”
ChatGPT is a generative language model. It was not built for this kind of use case, it was never intended for it, and complaining that it doesn’t succeed at this is like complaining that it can’t make you a pizza. The only logical response is “well yeah, what did you expect it to do?”
This is just an anecdotal post, not a complaint. No need to take it as seriously as you seem to take it… :)
I find it very useful as a “secretary” and to help me structure my thoughts. When I work on a document (usually technical) I first ask him to propose a document structure, explaining to him what kind of document it is and what the subject is. I then tell him how to improve his proposal.
Then I take each chapter and tell him to ask me questions to help me structure my reasoning. Then, I answer his questions and he writes the paragraph from my answers.
I find that it helps greatly, it allows me to write documents much faster and I also have the feeling it improves the quality. It allows me to sometimes think about things that I wouldn’t have on my own.
Yeah I’ve been noticing this lately too. It’s starting to dredge up random slapped-together information. Last week, for fun, I had it tell me the plot to an obscure N64 series I loved as a kid. I had it do this several times and even provided a full correct plot but the AI simply kept randomly putting information together and I’m not sure where much of it was coming from.
Having it generate book lists (like for learning a programming language) shows problems obvs. It would show me the same book across 4 editions, make up information about the books (unhelpful when you’re looking for a book with exercises in it), and just general…this didn’t feel like a problem a month ago. Maybe it was. I mostly use it to generate writing prompts.
Use Bing Chat for this kind of thing, it runs on GPT-4.
They’re real; they just haven’t been written yet.
I’m possibly just vomiting something you already know here, but an important distinction is that the problem isn’t that ChatGPT is full of “incorrect data”; it’s that it has no concept of correct or incorrect, and it doesn’t store any data in the sense we think of it.
It is a (large) language model (LLM) which does one thing, albeit incredibly well: output a token (a word or part of a word) based on the statistical probability of that token following the previous tokens, based on a statistical model generated from all the data used to train it.
It doesn’t know what a book is, nor does it have any memory of any titles of any books. It only has connections between tokens, scored by their statistical probability of following each other.
It’s like a really advanced version of predictive texting, or the predictive algorithm that Google uses when you start typing a search.
If you ask it a question, it only starts to string together tokens which form an answer because the network has been trained on vast quantities of text which have a question-answer format. It doesn’t know it’s answering you, or even what a question is; it just outputs the most statistically probable token, appends it to your input, and then runs that loop.
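To make that loop concrete, here is a toy sketch in Python. It is a word-level bigram counter, not a neural network, and the training text and function names are made up by me for illustration. The only thing it shares with a real LLM is the shape of the loop: pick the most probable next token given what came before, append it, repeat.

```python
from collections import Counter, defaultdict

# Tiny invented "training data" (an assumption for illustration only).
CORPUS = "the cat sat on the mat . the cat sat on the rug . the dog ate the fish ."


def train_bigrams(text):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    tokens = text.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, prompt, max_new_tokens=5):
    """Repeatedly append the most frequent follower of the last token."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # the last token was never followed by anything in training
        next_token = followers.most_common(1)[0][0]
        tokens.append(next_token)
    return " ".join(tokens)


counts = train_bigrams(CORPUS)
print(generate(counts, "the"))  # the cat sat on the cat
```

The output reads like a sentence, yet it never appears in the training text, and nothing in the code checks whether it is true. It is just the most probable continuation at each step.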
Sometimes it outputs something accurate - perhaps because it encountered a particular book title enough times in the training data that it is statistically probable that it will output it again; or perhaps because the title itself is statistically probable (e.g. the title “Voyage to the Stars Beyond” will be much more statistically likely than “Significantly Nine Crescent Unduly”, even if neither title actually existed in the training data).
Lots of the newer AI services put different LLMs together, along with other tools to control output and format input in a way which makes the response more predictable, or even which run a network request to look up additional data (more tokens), but the most significant part of the underlying tech is still fundamentally unable to conceptualise the notion of accuracy, let alone ensure that its output is accurate.
Maybe there will be another breakthrough in another area of AI research of which LLMs will form an important part, but the hype train has been running hard to categorise LLMs as AI, which is disingenuous. They’re incredibly impressive non-intelligent automatic text generators.
What would be your definition of intelligence if ChatGPT is not intelligent?
My definition would be something along the lines of the ability to use knowledge, ideas and concepts to solve a particular problem. For example, if you ask “what should I do if I see a black bear approaching?”, both you and ChatGPT would answer the question by using the knowledge that black bears can be scared off to come to the solution “make yourself look big and yell”.
The only difference is the type of knowledge available. People can have experiential knowledge, e.g. you saw a guy scare off a bear one time by yelling and waving his arms. ChatGPT doesn’t have that because it doesn’t have experiences. It does have contextual knowledge like us, e.g. you read or heard from someone that you can scare off a bear. This type of knowledge, though, is inherently probabilistic: the person who told you could always be giving false information. That doesn’t make you unintelligent for using it, and it doesn’t mean you don’t understand accuracy if it turns out to be false; it’s just that your brain guessed it was true and the guess was wrong.
Just as a fun example of a really basic language model, here’s my phone’s predictive model answering your question. I put the starting tokens in brackets for illustration only; everything following is generated by choosing one of the three suggestions it gives me. I mostly chose the first but occasionally the second or third option, because it has a tendency to get stuck in loops.
[We know LLMs are not intelligent because] they are not too expensive for them to be able to make it work for you and the other things that are you going to do.
Yeah it’s nonsense, but the main significant difference between this and an LLM is the size of the network and the quantity of data used to train it.
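If you want to reproduce the “three suggestions” experiment without a phone, here is a rough sketch. I have no idea how any actual keyboard implements its predictions, so treat this purely as an assumption: a word-pair counter over a short invented text, offering the three most frequent followers of the last word. The text and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Invented sample text standing in for whatever a phone has learned from its user.
TEXT = ("you are going to do it and you are going to make it work "
        "and you are able to do it")


def train(text):
    """Count which word follows which."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def suggest(counts, word, k=3):
    """Return up to k most frequent followers, like a suggestion bar."""
    return [w for w, _ in counts[word].most_common(k)]


def always_pick_first(counts, start, steps):
    """Simulate tapping the first suggestion every time."""
    words = [start]
    for _ in range(steps):
        options = suggest(counts, words[-1])
        if not options:
            break
        words.append(options[0])
    return words


counts = train(TEXT)
print(suggest(counts, "to"))  # ['do', 'make']
print(" ".join(always_pick_first(counts, "you", 12)))
```

Always tapping the first suggestion ends up cycling through the same few words, which is the loop behaviour described above. Occasionally taking the second or third suggestion is what breaks the cycle.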