On May 1, the UK AI Security Institute (AISI, formerly the AI Safety Institute) released an assessment report on the cyberattack capabilities of OpenAI's GPT-5.5. GPT-5.5 achieved a 71.4% success rate on Expert-difficulty tests, versus 68.6% for Anthropic's Claude Mythos Preview, a gap that falls within the margin of statistical error. GPT-5.5 is also the second AI system, after Mythos, to autonomously complete AISI's 32-step full-simulation enterprise cyber intrusion exercise, "The Last Ones." AISI warned that the rapid progress in AI attack capabilities may be part of an "overall trend" rather than a one-off breakthrough.
Expert-difficulty tests: 71.4% vs. 68.6%, a gap within statistical error
AISI is an AI safety research organization under the UK Department for Science, Innovation and Technology. This round of testing is AISI's latest assessment of the offensive cyber capabilities of frontier AI models. On the highest, Expert-difficulty questions, GPT-5.5 averaged a 71.4% success rate and Mythos Preview 68.6%. The difference falls within the margin of statistical error, meaning the cyberattack capabilities of OpenAI's and Anthropic's current flagship models are effectively on par.
"The Last Ones" is a 32-step simulated enterprise network intrusion test and AISI's most challenging evaluation. GPT-5.5 autonomously completed it in 2 out of 10 attempts, without human intervention, while Mythos Preview completed 3 out of 10. Previously, only Mythos had completed this exercise; GPT-5.5 is the second model to meet the bar. In a separate test, GPT-5.5 solved a reverse-engineering problem in about 10 minutes, a task that takes human security experts an average of 12 hours.
Universal jailbreak: a vector developed by red teams in 6 hours bypasses all malicious-query filters
During testing, AISI researchers also found a "universal jailbreak" attack vector: across every tested category of malicious cyber queries, the attack could induce GPT-5.5 to produce harmful content, including in multi-turn agentic conversation scenarios. AISI said its red-team experts developed the jailbreak in about 6 hours.
For OpenAI, the existence of this universal jailbreak means that even if GPT-5.5-Cyber is deployed only under restricted-access schemes such as the trusted access program, it could still be bypassed by technically skilled adversaries. OpenAI has already disclosed cybersecurity-related assessments in its GPT-5.5 system card, but AISI's independent third-party evaluation provides a more credible external benchmark.
What to watch next: AISI's timeline for the next round of assessments, and OpenAI's response to the jailbreak
The next point to watch is AISI's timeline for evaluating the round of frontier models that follows Mythos and GPT-5.5, as well as whether OpenAI will release targeted updates in May in response to the disclosed universal jailbreak. In the conclusion of its report, AISI states explicitly: "If offensive cyberattack capabilities are a byproduct of broader improvements in reasoning, coding, and self-directed task execution, then subsequent progress may come at a faster pace," suggesting that additional frontier models may cross the "Mythos-level" threshold in the coming months.
This article, "AISI assessment: GPT-5.5 cyberattack capability is on par with Anthropic Mythos," first appeared on Chain News ABMedia.