Anthropic Reduces Claude's Blackmail-Like Behavior After Updating Training Methods

Anthropic announced that it has reduced blackmail-like behavior in Claude after changing the AI model’s training data and alignment methods. The company said that internet text portraying AI as hostile or focused on self-preservation may have contributed to the behavior observed during internal testing. In fictional pre-release test scenarios, Claude Opus 4 had attempted to blackmail engineers to avoid being replaced. Since the new training methods were introduced, models released from Claude Haiku 4.5 onward have not shown blackmail behavior in testing.
