Link to original tweet:
https://twitter.com/sayashk/status/1671576723580936193?s=46&t=OEG0fcSTxko2ppiL47BW1Q
Screenshot:
Transcript:
I'd heard that GPT-4's image analysis feature wasn't available to the public because it could be used to break Captcha.
Turns out it's true: The new Bing can break captcha, despite saying it won't: (image)
I love when it tells you it can't do something and then does it anyway.
Or when it tells you that it can do something it actually can't, and it hallucinates like crazy. In the early days of ChatGPT I asked it to summarize an article at a link, and it gave me a very believable but completely false summary based on the words in the URL.
This was the first time I saw wild hallucination. It was astounding.
It's even better when you ask it to write code for you: it generates a decent-looking block, but upon closer inspection it imports a nonexistent library that just happens to do exactly what you were looking for.
That's the best sort of hallucination, because it gets your hopes up.
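One cheap way to catch the nonexistent-library case before running generated code is to check whether each imported module can be found at all. This is only a minimal sketch: `missing_imports` is a hypothetical helper name, it checks top-level module names against the local environment, and it says nothing about whether a module that does exist has the functions the generated code calls.

```python
import ast
import importlib.util


def missing_imports(source: str) -> list[str]:
    """Return top-level modules imported by `source` that are not installed locally."""
    tree = ast.parse(source)
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0); they refer to the local package.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    # find_spec returns None when no module of that name can be located.
    return sorted(n for n in names if importlib.util.find_spec(n) is None)


if __name__ == "__main__":
    generated = "import json\nimport totally_made_up_lib_xyz\n"
    print(missing_imports(generated))
```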
I've not played with it much, but does it always describe the image first like that? I've been trying to think about how the image input actually works. My personal suspicion is that it uses an off-the-shelf visual understanding network (think reverse Stable Diffusion) to generate a description, then just uses GPT normally to complete the response. This could explain the disconnect here, where it can't erase what the visual model wrote, but that could all fall apart if it doesn't always follow this pattern. Just thinking out loud here.
They need to make captchas better or implement PoW (proof of work). Telling your AI not to solve captchas is stupid and makes it dumber in unrelated tasks, just like all the other attempts at censoring these models.
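For anyone unfamiliar with the PoW idea: the server hands out a random challenge, and the client has to find a nonce whose hash falls below a target before the request is accepted. Below is a rough hashcash-style sketch, not any particular site's scheme; the function names, the 8-byte nonce encoding, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import itertools


def _hash_value(challenge: bytes, nonce: int) -> int:
    """SHA-256 of challenge plus an 8-byte big-endian nonce, as an integer."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce whose hash has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in itertools.count():
        if _hash_value(challenge, nonce) < target:
            return nonce


def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash confirms the client did the work."""
    return _hash_value(challenge, nonce) < (1 << (256 - difficulty_bits))


if __name__ == "__main__":
    challenge = b"example-challenge"  # a real server would issue a fresh random value
    nonce = solve(challenge, 16)      # about 65,536 hashes on average
    print(nonce, verify(challenge, nonce, 16))
```

The asymmetry is the point: solving costs roughly 2^difficulty_bits hashes, while checking costs one. It makes bulk automated requests more expensive without trying to tell humans and bots apart.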
Does one need the app to upload an image? I use it in a web browser and don't see any option to upload an image.