DeepSeek-R1 Hallucinates 4x More Than V3, Raising Red Flags for Crypto AI Agent Tokens
DeepSeek-R1, the flagship reasoning model from Chinese lab DeepSeek, has a hallucination rate of 14.3% on Vectara’s HHEM 2.1 benchmark. That is nearly four times the 3.9% rate of DeepSeek-V3, its non-reasoning predecessor. The gap raises hard questions for the crypto sector. A fast-growing class of AI agent tokens now leans on reasoning-style LLMs for autonomous...