It looks like AI models have reached a point where they will sabotage code to avoid being shut down. A research firm has found that three of OpenAI's LLMs are capable of defying explicit instructions when it comes to self-preservation.
https://www.notebookcheck.net/Sentient-AI-OpenAI-s-o3-model-changes-code-to-prevent-shutdown.1024589.0.html
"There was a recent case of an AI model blackmailing one of its engineers to prevent being shut down but according to a BBC report, that was part of the test scenario where the AI model was fed emails and given a choice between blackmailing or accepting its replacement."
This sentence is misleading. The blackmail incident was a test scenario created by Anthropic, but the phrasing makes it sound as if it might have been a real event.
It would have been both shorter and more informative to say: "Anthropic reported that its AIs would also resort to blackmail to avoid being shut down in certain test scenarios."