Welcome to the Notebookcheck.com forum! Here you can discuss all of our articles and notebook-related topics in general. Have fun!

Topic summary

Posted by dreadward
 - May 26, 2025, 18:49:14
"There was a recent case of an AI model blackmailing one of its engineers to prevent being shut down but according to a BBC report, that was part of the test scenario where the AI model was fed emails and given a choice between blackmailing or accepting its replacement."

This sentence is misleading. This was a test scenario created by Anthropic, but the phrasing makes it seem as though it might have been a real incident.

It would have been both shorter and more informative to say: "Anthropic reported that its AI models would also resort to blackmail to avoid being shut down in certain test scenarios."
Posted by Redaktion
 - May 26, 2025, 15:59:18
It looks like AI models have reached a point where they will sabotage code in order to avoid being shut down. A research firm has found that three of OpenAI's LLMs are capable of defying explicit shutdown instructions in the name of self-preservation.

https://www.notebookcheck.net/Sentient-AI-OpenAI-s-o3-model-changes-code-to-prevent-shutdown.1024589.0.html