13 points

It will almost always be detectable if you just read what is written. Especially for academic work. It doesn’t know what a citation is, only what one looks like and where they appear. It can’t summarise a paper accurately. It’s easy to force laughably bad output by just asking the right sort of question.

The simplest approach for setting homework is to give them the LLM output and get them to check it for errors and omissions. LLMs can’t critique their own work and students probably learn more from chasing down errors than filling a blank sheet of paper for the sake of it.

1 point

Chad comment right here…

54 points

Given how much AI has advanced in the past year alone, saying it will “always” be easy to spot is extremely short-sighted.

1 point

Some things are inherent in the way current LLMs work. An LLM doesn’t reason and it doesn’t understand; it just predicts the next word out of likely candidates based on the previous words. It can’t look ahead to know if it’s got an answer, and it can’t backtrack to change previous words if it later finds out it’s written itself into a corner. It won’t even know it’s written itself into a corner; it will just continue predicting in the pattern it’s seen, even if the result makes little or no sense to a human.
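To make the "predict the next word, never go back" idea concrete, here is a minimal toy sketch. It is an assumption-laden illustration, not how any real LLM is built: real models use neural networks over tokens and usually sample rather than always taking the top candidate. The names `train_bigrams` and `generate` and the tiny corpus are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows


def generate(follows, start, max_words=8):
    """Greedy generation: always append the most frequent follower.

    Words already emitted are never revised, so there is no backtracking.
    """
    out = [start]
    for _ in range(max_words - 1):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # no known follower for the last word
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical training text, chosen only to keep the counts easy to follow.
corpus = "the cat sat on the mat and the cat sat on the fish"
model = train_bigrams(corpus)
print(generate(model, "the"))
```

With this corpus the toy model gets stuck repeating "the cat sat on", which is a crude picture of continuing a learned pattern without noticing it has stopped making sense.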

It just mimics the source data it’s been trained on, following the patterns it’s learned there. At no point does it have any sort of understanding of what it’s saying. In some ways it’s similar to the man who learned how enough French words were spelled to win the national Scrabble competition, without any clue what the words actually mean.

And until we get a new approach to LLMs, we can only improve them by adding more training data and more layers, allowing them to pick out more subtle patterns in larger amounts of data. But with the current approach, you can’t guarantee that what it writes will be correct, or even make sense.

5 points

“it just predicts the next word out of likely candidates based on the previous words”

An entity that can consistently predict the next word of any conversation, book, news article with extremely high accuracy is quite literally a god because it can effectively predict the future. So it is not surprising to me that GPT’s performance is not consistent.

“It won’t even know it’s written itself into a corner”

In many cases it does. For example, if GPT gives you a wrong answer, you can often just send an empty message (a single space) and GPT will say something like: “Looks like my previous answer was incorrect, let me try again: blah blah blah”.

“And until we get a new approach to LLMs, we can only improve it by adding more training data and more layers allowing it to pick out more subtle patterns in larger amounts of data.”

This says nothing. You are effectively saying: “Until we can find a new approach, we can only expand on the existing approach” which is obvious.

But new approaches come all the time! Advances in tokenization come all the time. Every week there is a new paper with a new model architecture. We are not stuck in some sort of hole.

25 points

People seem to grasp onto weaknesses AI has now and say that they will have them forever, like how text AI lies, and image generation AI can’t draw hands.

But these AIs are advancing unimaginably quickly. Two years ago generated text was pretty bad and quickly became incoherent, and one year ago generated images were mostly strange mush.

3 points

Spot on! Actually people still talk about hands but it’s already been solved with many newer image gen models… The hands they produce look perfectly fine usually these days.

24 points

This is not entirely correct, in my experience. With the current version of GPT-4 you might be right, but the initial versions were extremely good. Clearly you have to work with it; you cannot ask it for the whole work.

4 points

That’s not true! There are heaps of early-GPT articles pointing out how much bullshit it regurgitates (e.g. “Why does ChatGPT constantly lie?”). And there’s no evidence at all that the breathless fanboys have even stopped to check.

7 points

I meant the initial versions of GPT-4. ChatGPT isn’t lying, simply because lying implies a malevolent intent. GPT-4 has no intent; it just provides an output given an input, and that output can be either wrong or correct. A model able to provide more correct answers is a more accurate model. Computing accuracy for an LLM is not trivial, but GPT-4 is still a good model. The user has to know how to use it, what to expect, and how to evaluate the result. If they are unable to do so, it’s completely their fault.

Why are you so pissed off at a good NLP model?

2 points

I’m no GPT booster, but I think that the real problem with detectability here

“It will almost always be detectable if you just read what is written. Especially for academic work.”

is that it requires you to know the subject and content already, and to be giving the paper a relatively detailed reading. For a rube reading the paper and trying to learn from it, a lot of GPT content is easily mistaken for legitimate work. And it’s getting better. We’re not safe simply assuming that AI today is as good as it will ever get and that the clear errors we can detect can never be addressed.

Penetrating academic writing, for academics, is probably one of the highest barriers of any writing task, AI or not.

But being dismissive of the threat of AI content because it’s not able to convincingly fake some of the hardest writing that real people do is maybe sidestepping a lot of much more casual writing - that still carries significance and consequence.

8 points

I think there’s a big difference between being able to identify an AI by talking to it and being able to identify something written by an AI, especially if a human has looked over it for obvious errors.

3 points

What you are describing is true of older LLMs. It’s less true of GPT-4. GPT-5, or whatever it is they are training now, will likely begin to shed these issues.

The shocking thing that we discovered that led to all of this is that this sort of LLM continues to scale in capabilities with the quality and size of the training set. AI researchers were convinced that this was not possible until GPT proved that it was.

So the idea that you can look at the limitations of the current generation of LLM and make blanket statements about the limitations of all future generations is demonstrably flawed.

2 points

They cannot be anything other than stochastic parrots because that is all the technology allows them to be. They are not intelligent, they don’t understand the question you ask or the answer they give you, and they don’t know what truth is, let alone how to determine it. They’re just good at producing answers that sound like a human might have written them. They’re a parlour trick. Hi-tech Magic 8-Balls.

4 points

“They cannot be anything other than stochastic parrots because that is all the technology allows them to be.”

Are you referring to humans or AI? I’m not sure you’re wrong about humans…

8 points

“LLMs can’t critique their own work”

In many cases they can. This is commonly used to improve their performance: https://arxiv.org/abs/2303.11366
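As a rough illustration of what a critique-and-retry loop can look like, here is a minimal sketch. It is not the algorithm from the linked paper: `toy_model`, `toy_evaluator`, and `refine` are hypothetical stand-ins, and in a real setup the model would be an LLM call and the evaluator might be unit tests, a second prompt, or a human check.

```python
def toy_model(task, reflections):
    """Hypothetical stand-in for an LLM call.

    It answers wrongly on the first try and correctly once it has feedback.
    """
    a, b = task
    return str(a + b) if reflections else str(a + b + 1)


def toy_evaluator(task, answer):
    """Hypothetical checker: returns (ok, feedback) for a simple sum task."""
    a, b = task
    expected = str(a + b)
    if answer == expected:
        return True, ""
    return False, f"{answer} is wrong for {a} + {b}; recheck the arithmetic."


def refine(task, model, evaluator, max_attempts=3):
    """Generate, critique, and retry, feeding earlier critiques back in."""
    reflections = []
    answer = None
    for attempt in range(1, max_attempts + 1):
        answer = model(task, reflections)
        ok, feedback = evaluator(task, answer)
        if ok:
            return answer, attempt
        reflections.append(feedback)  # remembered for the next attempt
    return answer, max_attempts


answer, attempts = refine((2, 2), toy_model, toy_evaluator)
print(answer, attempts)
```

The design point is only that the feedback from one attempt becomes part of the input to the next; how reliable the critique itself is depends entirely on the evaluator.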

-1 points

*accurately

5 points

Whoops, meant to say: “In many cases, they can accurately (critique their own work)”. Thanks for correcting me!


Technology

!technology@lemmy.world
