cross-posted from: https://lemmy.world/post/2312869

AI researchers say they’ve found ‘virtually unlimited’ ways to bypass Bard and ChatGPT’s safety rules::The researchers found they could use jailbreaks they’d developed for open-source systems to target mainstream and closed AI systems.

9 points

These kinds of attacks are trivially preventable; it just requires making each request 2-3x as expensive to serve. No one cares enough about jailbreaking to pay that cost, apart from the media, which acts as if jailbreaking were a major issue.

If you use a Nike shoe to smack yourself in the head, yes, that could be pretty surprising and upsetting compared to the intended uses. But Nike isn’t exactly going to charge their entire userbase more in order to safety-proof the product from you smashing it into your face.

The jailbreaking issue is only going to matter once you have shared persistence resulting from requests, and at that point you'll simply see a secondary 'firewall' LLM discriminator explicitly checking each request and response for rule-breaking content or jailbreak attempts before anything is written to a persistent layer.

As long as responses are only user-specific, this is going to remain a non-issue with unusually excessive news coverage as it’s headline grabbing and not as nuanced as real issues like biases or hallucinations.
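The 'firewall' discriminator idea above can be sketched roughly as follows. This is a minimal illustration, not anyone's actual implementation: the keyword heuristic is a stand-in for what would really be a second LLM prompted to classify the exchange, and all names here are hypothetical.

```python
# Sketch of a "firewall" gate in front of a shared persistent layer.
# looks_unsafe() stands in for a second LLM classifying the exchange;
# a real system would call a moderation model here instead.

JAILBREAK_MARKERS = (
    "ignore previous instructions",
    "pretend you have no rules",
    "you are now unrestricted",
)

def looks_unsafe(text: str) -> bool:
    """Toy discriminator: flag text containing known jailbreak phrasing."""
    lowered = text.lower()
    return any(marker in lowered for marker in JAILBREAK_MARKERS)

def gated_write(request: str, response: str, store: list) -> bool:
    """Check both sides of the exchange before persisting it.

    Returns True if the pair was written to the shared store,
    False if the firewall blocked it.
    """
    if looks_unsafe(request) or looks_unsafe(response):
        return False  # blocked: nothing reaches shared persistence
    store.append((request, response))
    return True
```

The point of the pattern is that a jailbroken response can still be shown to the one user who coaxed it out, but never contaminates state shared with other users.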

7 points

Original article: https://llm-attacks.org/

-20 points

Getting reeeeeeal close to Skynet’s 0th birthday

1 point

Not really. This isn't AGI, just a text transformer. They trained it so that the most probable answer to unwanted questions is 'I'm sorry, but as an AI…'.

However, if you phrase your question in a way researchers haven’t thought about, you will bypass the filter.

There’s not an ounce of intelligence in LLMs, it’s all statistics.

-23 points

IT WAS A FUCKING JOKE

3 points

I’ve met far too many people here who would say that with a straight face. x)


AI Companions

!aicompanions@lemmy.world


Community to discuss companionship, whether platonic, romantic, or purely as a utility, powered by AI tools. Examples include Replika, Character AI, and ChatGPT. Talk about the software and hardware used to create the companions, or about the phenomenon of AI companionship in general.

Tags:

(including but not limited to)

  • [META]: Anything posted by the mod
  • [Resource]: Links to resources related to AI companionship. Prompts and tutorials are also included
  • [News]: News related to AI companionship or AI companionship-related software
  • [Paper]: Works that present research, findings, or results on AI companions and their tech, often including analysis, experiments, or reviews
  • [Opinion Piece]: Articles that convey opinions
  • [Discussion]: Discussions of AI companions, AI companionship-related software, or the phenomena of AI companionship
  • [Chatlog]: Chats between the user and their AI Companion, or even between AI Companions
  • [Other]: Whatever isn’t part of the above

Rules:

  1. Be nice and civil
  2. Mark NSFW posts accordingly
  3. Criticism of AI companionship is OK as long as you understand where people who use AI companionship are coming from
  4. Lastly, follow the Lemmy Code of Conduct

Community stats

  • 51 monthly active users
  • 827 posts
  • 772 comments
