agentic AI, AI security, benchmarks and evals, frontier labs

DeepSeek scores 98% on the wrong benchmark

A CAISI report reveals that DeepSeek's R1 models are highly vulnerable to agent hijacking attacks, highlighting critical security disparities compared to US-based frontier models.

ExoBrain

10 October 2025 · 1 min read

This chart comes from a new report from CAISI (the Center for AI Standards and Innovation), a division within NIST under the US Department of Commerce. DeepSeek’s R1 models appear alarmingly vulnerable to agent hijacking attacks, with attack success rates reaching 98% for critical exploits like downloading malware and 89% for sending phishing emails. In contrast, US models from OpenAI and Anthropic show dramatically lower vulnerability. Whilst the industry obsesses over benchmark scores for reasoning and coding abilities, these security tests reveal equally consequential differences between models. Let’s hope such evaluations become the norm.
