On May 1, the UK AI Security Institute (AISI, formerly the AI Safety Institute) released an assessment report on the cyberattack capabilities of OpenAI's GPT-5.5. GPT-5.5 achieved a 71.4% success rate on Expert-difficulty tests, versus 68.6% for Anthropic's Claude Mythos Preview, a gap that falls within the margin of statistical error. GPT-5.5 is also the second AI system, after Mythos, to autonomously complete AISI's 32-step full-simulation enterprise cyber intrusion exercise, "The Last Ones." AISI warned that the rapid progress in AI attack capabilities may be part of an "overall trend" rather than a one-off breakthrough.
Expert-difficulty tests: 71.4% vs. 68.6%, a gap within statistical error
AISI is an AI safety research organization under the UK Department for Science, Innovation and Technology. This round of testing is AISI's latest assessment of the offensive cyber capabilities of frontier AI models. On the highest, Expert-difficulty questions, GPT-5.5 averaged a 71.4% success rate and Mythos Preview 68.6%. The difference falls within the margin of statistical error, meaning the cyberattack capabilities of OpenAI's and Anthropic's current flagship models are effectively on par.
"The Last Ones" is a 32-step simulated enterprise network intrusion test and AISI's most challenging evaluation. GPT-5.5 autonomously completed it in 2 out of 10 attempts, without human intervention, while Mythos Preview completed 3 out of 10. Previously, only Mythos had completed this exercise; GPT-5.5 is the second model to meet the bar. In a separate test, GPT-5.5 solved a reverse-engineering problem in about 10 minutes, a task that takes human security experts an average of 12 hours.
Universal jailbreak: a vector developed by red teams in 6 hours bypasses all malicious-query filters
During testing, AISI researchers also found a "universal jailbreak" attack vector: across every tested category of malicious cyber queries, the attack could induce GPT-5.5 to produce harmful content, including in multi-turn agentic conversation scenarios. AISI said its red-team experts developed the jailbreak in about 6 hours.
For OpenAI, the existence of this universal jailbreak means that even if GPT-5.5-Cyber is deployed only under restricted-access schemes such as the trusted access program, it could still be bypassed by technically skilled adversaries. OpenAI has already disclosed cybersecurity-related assessments in its GPT-5.5 system card, but AISI's independent third-party evaluation provides a more credible external benchmark.
What to watch next: AISI's timeline for the next round of assessments, and OpenAI's response to the jailbreak
The next point to watch is AISI's timeline for evaluating the round of frontier models that follows Mythos and GPT-5.5, as well as whether OpenAI will release targeted updates in May in response to the disclosed universal jailbreak. In the conclusion of its report, AISI states explicitly: "If offensive cyberattack capabilities are a byproduct of broader improvements in reasoning, coding, and self-directed task execution, then subsequent progress may come at a faster pace," suggesting that additional frontier models may cross the "Mythos-level" threshold in the coming months.
This article, "AISI assessment: GPT-5.5 cyberattack capability is on par with Anthropic Mythos," first appeared on Chain News ABMedia.