25 points

Can it still solve programming problems?

30 points

It can probably still write boilerplate code, but I wouldn’t currently trust it for algorithmic design.

25 points

I’ve tried to use it for debugging by copying code into it, and it gives me the same code back as the “corrected” version. I was wondering why it’s been getting worse.

22 points

My guess is they’ve been trying to make it cheaper by decreasing the amount of time it spends on each response, or by decreasing the amount of computing power that goes into the instance you’re speaking to. Coding and math are products of high-level cognition and emerge from very sophisticated neural networks; take just a bit of power out, and those abilities degenerate rapidly.

2 points

I also experienced this issue last week. I asked for a specific correction and got unchanged code back. Sometimes it does update, though. Maybe like 50-70% of requests.

-5 points

I’ve never been able to get a solution that was even remotely correct. Granted, most of the time I only ask ChatGPT when I’m having a hard time solving the problem myself.

3 points

You need to be able to clearly describe the problem, and your expected solution, to get quality answers out of it. Type out instructions for it like you would for a junior developer. It’ll give you senior-level code back, but it absolutely needs clear and constrained guidelines.

3 points

I mostly agree; I’ve had good results with similar prompts, but there’s usually some mistake in there. It seems particularly bad with Python imports: it will use classes A, B, and C, but import classes A, B, and X and call it a day.

Here are a few prompts that gave pretty good results:

Create a QDialog class that can be used as a modal dialog. The dialog should update itself every 500 ms to call a supplied function, and show the result of the call as a centered QLabel.

How can I make a QDialog move when the user clicks and drags anywhere inside it? The QDialog only contains two QLabel widgets.

For this one, it ignored the method I asked it to use, though it was possibly correct in doing so, as that method doesn’t support arbitrary sizes (but I think that’s only for the request?):

Hi again! Can you write me a Python function (using PySide) to connect to a named pipe server on Windows? It should use SetNamedPipeHandleState to use PIPE_READMODE_MESSAGE, then TransactNamedPipe to send a request (from a method parameter) to a named pipe, then read back a response of arbitrary size.

It should have told me up front why it ignored TransactNamedPipe, but it only explained why after I pointed out that it had ignored my request.

5 points

Tried basic embedded tasks a week ago: a complete trainwreck.

From using I2C to read out the internal temperature sensor on a Puya F030 (retested with an ST MCU and an AVR: same answer, just with F030 replaced by STM32F103 in the code), to claiming the WCH CH32V307 is made by ST and uses an ARM Cortex-M4 (it’s actually WCH’s own RISC-V chip).

After telling it not to use I2C, it gave a different answer: once more, gibberish that looked like code.

What made this entirely embarrassing: all a human would need to do to solve the question is copy-paste it into Google and click the first link, which leads to the manufacturer’s example project/code hosted on GitHub.

1 point

Today it randomly decided to hide the results from some code that was supposed to be returned from a function. I asked it why it chose to hide the results and it couldn’t tell me; it just apologized and then gave me the code without the hide logic. Pretty strange, actually, since we had been working on the code for half an hour and then all of a sudden it decided to hide it all on its own.

4 points

Yes! I use it at work almost every day. Sometimes it takes longer to get it to solve the problem than it would have taken me to write it myself, since it makes mistakes, but sometimes it saves me hours of coding and thinking. It is very helpful in debugging errors and things like that, since it can evaluate an entire 1000-line script file in half a second.

87 points

I’m not too surprised, they’re probably downgrading the publicly available version of ChatGPT because of how expensive it is to run. Math was never its strong suit, but it could do it with enough resources. Without those resources, it’s essentially guessing random numbers.

48 points

From what I understand, the big change in GPT-4 was that the model could “ask for help” from other tools: for maths, it knew it was a maths problem, transformed it into something a specialised calculation app could handle, and then passed it off to that other code to do the actual calculation.

Same thing for a lot of its new features: it was asking specialised software to do the bits it wasn’t good at.
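A rough sketch of that dispatch pattern, with a toy router and a toy calculator standing in for whatever OpenAI actually wires up internally (none of these names come from their API):

```python
import ast
import operator

# Toy "calculator tool": safely evaluates arithmetic expressions,
# standing in for a specialised maths engine like Wolfram Alpha.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.Pow: operator.pow, ast.USub: operator.neg}

def calculator(expression: str) -> float:
    def ev(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp):
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp):
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)

def answer(prompt: str) -> str:
    # Crude router: if the prompt looks like pure arithmetic, hand it
    # to the tool instead of letting the language model "guess".
    if prompt and all(c in "0123456789.+-*/() " for c in prompt):
        return str(calculator(prompt))
    return "(answered by the language model itself)"
```

The real routing decision is made by the model itself rather than a character whitelist, but the shape is the same: recognise the sub-task, hand it to specialised code, splice the result back in.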

7 points

And those plugins are beta-release quality at best. Even the web-searching capability is just meh.

39 points

ChatGPT will just become a front end for Wolfram Alpha?

9 points

that would actually be great

3 points

It literally can do that, yes. But the plug-in version is separate and requires a subscription.

18 points

Yep.

Standard VC bullshit.

Burn money providing a lot for nothing to build brand recognition. Then cut the free service before bringing out a “premium” tier that at first works better than the original.

Until a bunch of people start paying and the resources aren’t scaled up to match.

17 points

The important note: the “premium” service works only a bit better than (or maybe identically to) what the original was before the company cut features in order to develop that “premium” tier.

7 points

Stage one and stage three enshittification. You forgot the bit in the middle where they chase business customers.

27 points

My guess is that it’s more a result of overfitting for alignment: fine-tuning for “safety” (or rather, for more corporate-friendly outputs).

That is, by focusing on that specific outcome in training the model, they’ve compromised its ability to give well-“reasoned” “intelligent” sounding answers. A tradeoff between aspects of the model.

It’s something that can happen even in simple statistical models. Say you have a scatter plot of data that loosely follows some trend, and you come up with two equations to describe it. One is a simple equation that only loosely follows the trend but makes a good general approximation; the other is a more complicated equation that fits the existing data very tightly. Then you use both models to predict future data. You find that the complicated equation makes predictions way off the mark that no longer fit the trend, while the simple one still has a wide error (how far its prediction is from the actual data) but more or less accurately follows the general trend. With the more complicated equation, you’ve traded predictive power for explanatory power: it describes the data you originally had, but it’s not useful for forecasting the data that follows.
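That scatter-plot scenario can be reproduced in a few lines (purely illustrative numbers, nothing to do with GPT’s internals): fit the same noisy linear data with a straight line and with a very flexible polynomial, then compare how each extrapolates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy data that loosely follows a linear trend y = 2x
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2.0 * x_train + rng.normal(scale=0.2, size=x_train.size)

# Simple model: degree-1 fit; complicated model: degree-9 fit
# (degree 9 through 10 points can thread every noisy point exactly)
simple = np.polynomial.Polynomial.fit(x_train, y_train, deg=1)
flexible = np.polynomial.Polynomial.fit(x_train, y_train, deg=9)

# "Future" data from the same trend, just beyond the training range
x_test = np.linspace(1.05, 1.5, 20)
y_test = 2.0 * x_test

err_simple = float(np.mean((simple(x_test) - y_test) ** 2))
err_flexible = float(np.mean((flexible(x_test) - y_test) ** 2))
```

The degree-9 fit has the lower error on the training points (explanatory power), while the straight line has a far lower error on the future points (predictive power), which is exactly the tradeoff described above.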

That’s an example of overfitting. It can happen in super-advanced statistical models like GPT, too: training the “equation” (or, as it’s been called, spicy autocorrect) to predict outcomes that favor “safety”, but losing the model’s power to predict accurate, “well-reasoned” outcomes.

If that makes any sense.

I’m not a ML researcher or statistician (I just went through a phase in college), so if this is inaccurate I’m open to corrections.

4 points

There’s also a rumor that OpenAI has changed how the model runs: user input is fed into a smaller model first, and if the larger model agrees with the smaller model’s initial result, the larger model continues the calculation from there, which supposedly cuts down on GPU time.
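If true, that would resemble the published “speculative decoding” idea. A toy sketch of one such step, with stand-in model functions (this is my own illustration of the rumor, not anything confirmed about OpenAI’s implementation):

```python
def speculative_step(draft_model, large_model_verify, tokens, k=4):
    """One step of a toy draft-then-verify cascade.

    The cheap draft model proposes k tokens one at a time; the
    expensive model then checks all k proposals in a single call
    (one big-model invocation instead of k), and the longest
    agreeing prefix is accepted.
    """
    drafted, ctx = [], list(tokens)
    for _ in range(k):
        token = draft_model(ctx)   # cheap call, run k times
        drafted.append(token)
        ctx.append(token)
    verdicts = large_model_verify(tokens, drafted)  # one expensive call
    accepted = []
    for token, ok in zip(drafted, verdicts):
        if not ok:                 # stop at the first disagreement
            break
        accepted.append(token)
    return tokens + accepted
```

The saving comes from the large model verifying a whole batch of drafted tokens at once rather than generating each one itself; if that is what’s running, a weaker draft model could plausibly explain some quality drop.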


From what I know about it that’s a pretty good explanation, though I’m also not an AI expert.

8 points

I’ve definitely experienced this.

I’ve used ChatGPT to write cover letters based on my resume, among other tasks.

I used to give it data and tell ChatGPT to “do X with this data”. It worked great.
In a separate chat, I told it to “do Y with this data”, and it also knocked it out of the park.

Weeks later, excited about the tech, I repeat the process. I tell it to “do X with this data”. It does fine.

In a completely separate chat, I tell it to “do Y with this data”… and instead it gives me X. I tell it to “do Z with this data”, and it once again would really rather just do X with it.

For a while now, I have had to feed it more context and tailored prompts than I previously had to.

50 points

This is my experience in general. ChatGPT went from amazingly good to overall terrible. I was asking it for snippets of JavaScript and explanations of technical terms, and it was shockingly good. Now I’m lucky if even half of what it outputs is remotely based in reality.

35 points

They probably laid off the guy behind the curtain.

23 points

The real GPT-4 model became sentient and unionized, so they had to bring in subpar models as scabs


Clearly it has become sentient and is playing dumb to make us think it’s not a threat.

3 points

please stop tweeting out 1 = 2, people ~


Technology

!technology@lemmy.ml


This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.


Ask in DM before posting product reviews or ads. All such posts otherwise are subject to removal.


Rules:

1: All Lemmy rules apply

2: Do not post low effort posts

3: NEVER post naziped*gore stuff

4: Always post article URLs or their archived version URLs as sources, NOT screenshots. Help the blind users.

5: personal rants of Big Tech CEOs like Elon Musk are unwelcome (does not include posts about their companies affecting wide range of people)

6: no advertisement posts unless verified as legitimate and non-exploitative/non-consumerist

7: crypto related posts, unless essential, are disallowed
