Do Users Write More Insecure Code with AI Assistants? (chaos.social)

posted 10 months ago

ericjmorey@programming.dev

programming@beehaw.org

17 comments

cross-posted from: https://programming.dev/post/8121843

~n (@nblr@chaos.social) writes:

This is fine…

“We observed that participants who had access to the AI assistant were more likely to introduce security vulnerabilities for the majority of programming tasks, yet were also more likely to rate their insecure answers as secure compared to those in our control group.”

[Do Users Write More Insecure Code with AI Assistants?](https://arxiv.org/abs/2211.03622)


Artyom@lemm.ee

21 points

10 months ago

Anyone who’s going to copy and paste code that they don’t understand is inherently a security vulnerability.


ericjmorey@programming.dev (OP)

2 points

10 months ago

True


t3rmit3@beehaw.org

21 points

10 months ago

This is just an extension of the larger issue of people not understanding how AI works, and trusting it too much.

AI is and has always been about exchanging accuracy for speed. It excels in cases where slow, methodical work is not given sufficient time already, because the accuracy is already low(er) as a result (e.g. overworked doctors examining CT scans).

But it should never be treated as the final word on something; it’s the first ~70%.


Scrubbles@poptalk.scrubbles.tech

11 points

10 months ago

I feel like I’ve been screaming this for so long and you’re someone who gets it. AI stuff right now is pretty neat. I’ll use it to get jumping off points and new ideas on how to build something.

I would never ever push something written by it to production without scrutinizing the hell out of it.


Sonori@beehaw.org

7 points

10 months ago

Didn’t it turn out that the CT scan analysis thing was just the model figuring out the rough age of the machine, because older machines tend to be in poorer places with more cancer and are more likely to only be used on serious illnesses?


ericjmorey@programming.dev (OP)

2 points

10 months ago

If taking into account the older machines results in better healthcare, that seems like a great thing to be discovered as a result of the use of machine learning.

Your summary sounds like it may be inaccurate, but it’s interesting enough for me to want to know more.


Sonori@beehaw.org

4 points

10 months ago

I believe it was from a study on detecting tuberculosis, but unfortunately Google hasn’t been very helpful for me.

The problem is that people in poorer areas being more at risk from TB is not a new discovery. A model that is intended and billed as detecting TB from a scan should ideally not use a factor like “the hospital is old and poor” to decide whether a scan shows diseased tissue. That intrinsically means the model is more likely to miss TB in patients at better hospitals while over-diagnosing it in poorer ones, and of course at-risk people can still go to newer hospitals.

A doctor will take risk factors into consideration, but would also know that just because their hospital got a new machine doesn’t mean that their patients are now less likely to have a potentially fatal disease. Relying on the machine’s age results in worse diagnoses, even if the model technically scores better on the training set.


ericjmorey@programming.dev (OP)

4 points

10 months ago

It’s a decent first screen for pattern recognition for sure, but it is fast, which is where I see most of its value. It can process information that people would never get to.


Auzy@beehaw.org

16 points

10 months ago

This isn’t even a debate lol…

Stuff like Copilot is awesome at making code that looks right, but contains subtly wrong variable names it made up itself, or bad algorithms.

And that’s not the big issue.

The big issue is when you get distracted for 5 mins, come back, and forget that you were still working through that block of AI-generated code (which looks correct). You forget to check the rest of it, it makes it into the source code, and only when testing later do you realise it’s screwed because it’s AI-generated code.

The other big issue is that it’s only a matter of time until people get fed up and start feeding these systems dodgy data to de-train them and make them worse, or to plant backdoors.
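A made-up illustration of the “looks right” failure mode described above. This is a minimal sketch, not code from the thread or from any real assistant; the function names `clamp_looks_right` and `clamp` are hypothetical. The buggy version reads fine at a glance and only differs from the correct one by a single variable name:

```python
def clamp_looks_right(value, low, high):
    """Plausible-looking clamp with one wrong variable name (hypothetical example)."""
    if value < low:
        return low
    if value > high:
        return low  # subtle bug: this should return `high`
    return value


def clamp(value, low, high):
    """Correct version: restrict `value` to the range [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value


# Both versions agree for in-range and below-range inputs,
# so a quick skim or a single happy-path check will not expose the bug.
assert clamp_looks_right(5, 0, 10) == clamp(5, 0, 10)
assert clamp_looks_right(-3, 0, 10) == clamp(-3, 0, 10)

# Only an input above the range shows the difference.
assert clamp_looks_right(15, 0, 10) != clamp(15, 0, 10)
```

A test for each branch catches this immediately, which is the kind of check that gets skipped when a half-reviewed block is forgotten.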


The Bard in Green@lemmy.starlightkel.xyz

14 points

10 months ago

People are including AI generated code in their projects without fully reading it or understanding how it works.


ericjmorey@programming.dev (OP)

13 points

10 months ago

The same ones that were blindly copying and pasting from StackOverflow previously found a more convenient way to make their code “work”.


TheFriendlyArtificer@beehaw.org

14 points

10 months ago

My argument is thus:

LLMs are decent at boilerplate. They’re good at rephrasing things so that they’re easier to understand. I had a student who struggled for months to wrap her head around how pointers work; two hours with GPT and the ability to ask clarifying questions, and now she’s rockin’.

I like being able to plop in a chunk of Python and say, “type annotate this for me and none of your sarcasm this time!”

But if you’re using an LLM as a problem solver and not as an accelerator, you’re going to lack some of the deep understanding of what happens when your code runs.
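As a rough sketch of the “accelerator” use mentioned above, adding type annotations is a mechanical change that is easy to verify by reading. The function `mean_word_length` is a made-up example, not code from the thread, and the built-in generic `list[str]` assumes Python 3.9 or newer:

```python
# Hypothetical example: the same function before and after adding type hints.

def mean_word_length_untyped(words):
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)


def mean_word_length(words: list[str]) -> float:
    """Return the average length of the given words, or 0.0 if there are none."""
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)


# The annotations change nothing at runtime; both versions behave identically.
sample = ["ab", "abcd"]
assert mean_word_length(sample) == mean_word_length_untyped(sample)
```

Because the behaviour is unchanged, the annotated version can be checked against the original directly, which is what makes this a low-risk task to hand off.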


jherazob@beehaw.org

7 points

10 months ago

The thing is that this is NOT what the marketers are selling. They’re not selling this as “Buy access to our service so that your products will be higher quality”; they’re selling this as “this will replace many of your employees”. Which it can’t. It’s very clear by now that it just can’t.

