Two-faced AI language models learn to hide deception - ‘Sleeper agents’ seem benign during testing but behave differently once deployed. And methods to stop them aren’t working. (www.nature.com)

posted 9 months ago by Lugh@futurology.today in futurology@futurology.today

9 comments

mateomaui@reddthat.com, 2 points, 9 months ago

Alright, I’ll switch to digging holes for the family burial ground.
