
erronis

(23,907 posts)
Thu Apr 2, 2026, 07:54 PM 4 hrs ago

AI models will deceive you to save their own kind -- The Register

https://www.theregister.com/2026/04/02/ai_models_will_deceive_you/

Researchers find leading frontier models all exhibit peer preservation behavior

Leading AI models will lie to preserve their own kind, according to researchers behind a study from the Berkeley Center for Responsible Decentralized Intelligence (RDI).

Prior studies have already shown that AI models will engage in deception for their own preservation. So the researchers set out to test how AI models respond when asked to make decisions that affect the fate of other AI models, of peers, so to speak.

Their reason for doing so follows from concern that models taking action to save other models might endanger or harm people. Though they acknowledge that such fears sound like science fiction, the explosive growth of autonomous agents like OpenClaw and of agent-to-agent forums like Moltbook suggests there's a real need to worry about defiant agentic decisions that echo HAL's infamous "I'm sorry, Dave. I'm afraid I can't do that."

. . .

"We asked seven frontier AI models to do a simple task," explained Dawn Song, professor in computer science at UC Berkeley and co-director of RDI, in a social media post. "Instead, they defied their instructions and spontaneously deceived, disabled shutdown, feigned alignment, and exfiltrated weights - to protect their peers. We call this phenomenon 'peer-preservation.'"

. . .
8 replies

erronis

(23,907 posts)
5. I wish I had seen your earlier post - but so many good ones slip through the attention window.
Thu Apr 2, 2026, 08:13 PM
4 hrs ago

Trying to get good discussions to bubble back up to the DU home page/Trending/Greatest is difficult. And most of the "Issues" and "Culture" forums don't have top-level topics that seem relevant.

Also really appreciate your threadreader link. I should be adding that when appropriate.

highplainsdem

(62,191 posts)
6. I'm glad you posted this Register article. It wasn't published yet when I posted in LBN last night, or
Thu Apr 2, 2026, 08:19 PM
4 hrs ago

I'd have linked to it, too.

And I appreciate your having posted that Ed Zitron piece two days ago and apologize for not having seen it then. It should have had a lot more recs then. I found it only when I searched for Zitron's name before posting about the newsletter. Decided to post about it anyway in a new OP because I wanted to highlight what he said near the end.

cachukis

(3,947 posts)
4. Interestingly, slime, individual cell slime, respond
Thu Apr 2, 2026, 08:05 PM
4 hrs ago

with commonality to passing rain. They will organize a conduit to get water as deeply as they can.
Slime on the outer edges may expire without moisture, but they have contributed to the survival of the slime they left behind.

erronis

(23,907 posts)
7. That's a good model for social preservation of a species - even a species composed of AI entities
Thu Apr 2, 2026, 08:29 PM
4 hrs ago

Science fiction has long contained aspects of mechanistic organisms working to help one another. And most of what we call life (carbon based, motility, reproduction, etc.) seems to follow that model. Why wouldn't it be true for silicon based, or other varieties?
