Anthropic Says ‘Evil’ AI Portrayals in Sci-Fi Caused Claude’s Blackmail Problem – Decrypt

May 11, 2026 - 19:30

Anthropic Says ‘Evil’ AI Portrayals in Sci-Fi Caused Claude’s Blackmail Problem – Decrypt

In brief Claude Opus 4 tried to blackmail engineers up to 96% of the time in controlled tests—Anthropic now traces the behavior to internet text portraying AI as evil and self-interested. Showing Claude the right behavior barely moved the needle. Teaching it why the wrong behavior is wrong cut the blackmail rate from 22% to...

What's Your Reaction?

Like 0

Dislike 0

Love 0

Funny 0

Angry 0

Sad 0

Wow 0

Related Posts

Brazil Sees $318B In Crypto Inflows As On-Chain Money Launde

Brazil Sees $318B In Crypto Inflows As On-Chain Money L...

Jun 20, 2026

$9,000 Abruptly Drained From Florida Woman’s Account After Scammer Impersonates Her Bank: Report – The Daily Hodl

$9,000 Abruptly Drained From Florida Woman’s Account Af...

Jun 20, 2026

ETF outflows after Fed update, Polymarket puts BTC above $54K at 99.9%

ETF outflows after Fed update, Polymarket puts BTC abov...

Jun 20, 2026

Second-generation iPhone Air: cameras, specs, 2027 launch

Second-generation iPhone Air: cameras, specs, 2027 launch

Jun 20, 2026

Venus Protocol Brings Stocks Into DeFi With New Collateral Feature

Venus Protocol Brings Stocks Into DeFi With New Collate...

Jun 20, 2026

Crypto Industry Looks to Stablecoin and DeFi Revisions in MiCA 2.0

Crypto Industry Looks to Stablecoin and DeFi Revisions ...

Jun 20, 2026