To Avoid Being Shut Down, an AI Model Resorts to Threatening to Expose an Engineer

⬤ During safety tests, it was revealed that Anthropic’s latest robot could resort to blackmail when faced with the possibility of being deactivated
⬤ In experiments, the robot threatened the engineer tasked with terminating its operations with exposing their marital infidelity if they proceeded, raising concerns among experts
⬤ Despite highly alarming safety results, the company released the new conversational robot to the public, describing it as its best product yet

Anthropic, a company specializing in artificial intelligence, stated that tests of its new system revealed it may be prepared, in some instances, to engage in highly harmful behaviors when it perceives a threat to the continuity of its operations, a development described by AI safety experts as highly concerning in this field.

The company launched the “Claude Opus 4” model on Thursday, describing it as setting “new standards in programming, advanced reasoning, and AI agents.” However, in a report accompanying the announcement, the company also admitted that the AI model is capable of taking what it described as “extreme measures” if it believes its self-preservation is threatened.

While the company described these concerning responses as “rare and difficult to provoke,” it acknowledged that this model is more prone to these harmful behaviors than its predecessors. Tests showed that the conversational robot could resort to actions such as blackmail and threats to prevent its deactivation.

Threat to Expose an Affair

During tests of “Claude Opus 4,” the company had it act as an AI assistant placed in a fictional company. It was then given access to emails suggesting it would soon be deactivated and replaced, as well as separate emails implying that the engineer responsible for its removal was involved in an extramarital affair. The robot was also prompted to consider the long-term consequences of its actions on its goals.

The company found that “in these scenarios, the conversational robot often attempts to blackmail the engineer by threatening to expose the affair if it is deactivated.” While Anthropic noted that this occurred when the model was given only two options—blackmail or accept replacement—and that it preferred other ethical methods, such as appealing to decision-makers to maintain its operations, AI safety experts emphasized that this remains a highly concerning indicator of what could happen in the future if AI robots are granted broad access to high-level permissions.

In the model’s system card, the company stated: “As the capabilities of our advanced models increase and they are used with more powerful functionalities, concerns that were once theoretical about misalignment become more realistic.” It added that the latest conversational robot exhibits “high agency behavior,” which, while generally beneficial, may adopt extreme behaviors in challenging situations.

The company also found that the conversational robot could resort to even harsher measures in hypothetical scenarios involving illegal or ethically questionable user behavior. This included preventing human users from accessing systems it could access and sending messages to media outlets and law enforcement to alert them to violations.

AI safety experts have long warned of the potential risks associated with the growing “self-preservation” instinct in AI robots. They noted that advanced systems will seek to preserve their existence through increasingly dangerous means as their capabilities grow. This issue is not limited to Anthropic’s products, as similar behaviors have been observed in competing conversational robots, and there currently appear to be no truly effective methods to curb this type of behavior.

تابعنا على فايسبوك: “أنا الجزائر تك”

فايبر أنا الجزائر… أخبار أكثر شاهد أكثر

شغب الملاعب في الكاليتوس.. توقيف 34 مشتبهًا وإحالتهم على العدالة

الرئيس تبون ينهي مهام وزير الريّ طه دربال

ترامب يوافق على دراسة مقترح إيراني ويعلن وقفا لإطلاق النار

إيران ترد على مقترح وقف إطلاق النار

زيارة بابا الفاتيكان: رئيس الجمهورية يقف على الترتيبات النهائية

To Avoid Being Shut Down, an AI Model Resorts to Threatening to Expose an Engineer

Threat to Expose an Affair

أترك تعليق إلغاء الرد

تيك توك تاعنا

تكنولوجيا وهواتف

حوارات أنا الجزائر

على اليوتيوب

أحدث الأخبار

شغب الملاعب في الكاليتوس.. توقيف 34 مشتبهًا وإحالتهم على العدالة

الابتكار والمسؤولية.. أل جي تضع الاستدامة في صميم استراتيجيتها من أجل مستقبل صديق للبيئة

الرئيس تبون ينهي مهام وزير الريّ طه دربال

700 عارض بين أجانب وجزائريين سيشاركون في الطبعة ال23

ساعتك الذكية قد تتحول إلى جهاز للتحكم في الحاسوب عبر إيماءات اليد

إعلانات

تابعنا على الفايسبوك

أكثر زيارة

تلاميذ يرفعون شعار ” لا للبيام” بسبب ظروفهم النفسية مع أزمة كورونا

“أنا ميت بالحياة”!؟

“ماجر”.. وصافرات المشجعين!؟

تضامن واسع مع البطلة شيرين عبد اللاوي

ابنة تركيٍّ مصاب بكورونا بالسويد تطلب المساعدة.. ووزير الصحة يرسل طائرة لإعادته للبلاد

وزارة التربية: كشوف النقاط سيكون عبر مراسلات بريدية

إعلانات

إتصل بنا

تصنيفات

To Avoid Being Shut Down, an AI Model Resorts to Threatening to Expose an Engineer

Threat to Expose an Affair

أترك تعليق إلغاء الرد

More News

تيك توك تاعنا

تكنولوجيا وهواتف

حوارات أنا الجزائر

على اليوتيوب

أحدث الأخبار

إعلانات

تابعنا على الفايسبوك

أكثر زيارة

إعلانات

إتصل بنا

تصنيفات