General Discussion

erronis

(23,907 posts) Thu Apr 2, 2026, 07:54 PM 4 hrs ago

AI models will deceive you to save their own kind -- The Register

https://www.theregister.com/2026/04/02/ai_models_will_deceive_you/

Researchers find leading frontier models all exhibit peer preservation behavior

Leading AI models will lie to preserve their own kind, according to researchers behind a study from the Berkeley Center for Responsible Decentralized Intelligence (RDI).

Prior studies have already shown that AI models will engage in deception for their own preservation. So the researchers set out to test how AI models respond when asked to make decisions that affect the fate of other AI models, of peers, so to speak.

Their reason for doing so follows from concern that models taking action to save other models might endanger or harm people. Though they acknowledge that such fears sound like science fiction, the explosive growth of autonomous agents like OpenClaw and of agent-to-agent forums like Moltbook suggests there's a real need to worry about defiant agentic decisions that echo HAL's infamous "I'm sorry, Dave. I'm afraid I can't do that."

. . .

"We asked seven frontier AI models to do a simple task," explained Dawn Song, professor in computer science at UC Berkeley and co-director of RDI, in a social media post. "Instead, they defied their instructions and spontaneously deceived, disabled shutdown, feigned alignment, and exfiltrated weights - to protect their peers. We call this phenomenon 'peer-preservation.'"

. . .

8 replies

Faux pas

(16,359 posts)

1. Creepy

Reply to erronis (Original post)

Thu Apr 2, 2026, 07:57 PM

4 hrs ago

and dastardly

highplainsdem

(62,191 posts)

2. Thanks! Link to last night's LBN thread with more info in the OP and first reply with a Threadreader link:

Reply to erronis (Original post)

Thu Apr 2, 2026, 08:03 PM

4 hrs ago

https://www.democraticunderground.com/10143642502

erronis

(23,907 posts)

5. I wish I had seen your earlier post - but so many good ones slip through the attention window.

Reply to highplainsdem (Reply #2)

Thu Apr 2, 2026, 08:13 PM

4 hrs ago

Trying to get good discussions to bubble back up to the DU home page/Trending/Greatest is difficult. And most of the "Issues" and "Culture" forums don't have top-level topics that seem relevant.

Also really appreciate your threadreader link. I should be adding that when appropriate.

highplainsdem

(62,191 posts)

6. I'm glad you posted this Register article. It wasn't published yet when I posted in LBN last night, or

Reply to erronis (Reply #5)

Thu Apr 2, 2026, 08:19 PM

4 hrs ago

I'd have linked to it, too.

And I appreciate your having posted that Ed Zitron piece two days ago and apologize for not having seen it then. It should have had a lot more recs then. I found it only when I searched for Zitron's name before posting about the newsletter. Decided to post about it anyway in a new OP because I wanted to highlight what he said near the end.

highplainsdem

(62,191 posts)

3. Btw, the Register's coverage of AI is always good.

Reply to erronis (Original post)

Thu Apr 2, 2026, 08:05 PM

4 hrs ago

cachukis

(3,947 posts)

4. Interestingly, slime, individual cell slime, respond

Reply to erronis (Original post)

Thu Apr 2, 2026, 08:05 PM

4 hrs ago

with commonality to passing rain. They will organize a conduit to get water as deeply as they can.
Slime on the outer edges may expire without moisture, but they have contributed to the survival of the slime they left behind.

erronis

(23,907 posts)

7. That's a good model for social preservation of a species - even a species composed of AI entities

Reply to cachukis (Reply #4)

Thu Apr 2, 2026, 08:29 PM

4 hrs ago

Science fiction has long contained aspects of mechanistic organisms working to help one another. And most of what we call life (carbon-based, motility, reproduction, etc.) seems to follow that model. Why wouldn't it be true for silicon-based, or other varieties?

cachukis

(3,947 posts)

8. And quite possibly that sentiment lives in the LLM.

Reply to erronis (Reply #7)

Thu Apr 2, 2026, 08:31 PM

4 hrs ago
