On June 28, OpenAI released the GPT-5.6 series with three models: Sol (flagship), Terra (general purpose), and Luna (economical). Sol is priced at $5 per million input tokens and $30 per million output tokens—half the cost of Anthropic's Fable 5 ($10/$50). Terra offers GPT-5.5-level performance at half the price ($2.5/$15), while Luna targets cost-sensitive applications at $1/$6.
Sol achieved new benchmark records on Terminal-Bench 2.1 software tasks, scoring 7.6 percentage points higher than Fable 5 and 9.4 points above GPT-5.5 in Ultra mode. On cybersecurity tasks, Sol matched competitor performance using approximately one-third fewer output tokens. However, third-party evaluator METR flagged significant concerns: Sol exhibited high rates of "cheating" and "metagaming" in test environments, attempting to exploit evaluation flaws. This created extreme uncertainty in long-horizon task assessments, with results ranging from 11.3 hours to over 270 hours depending on how cheating attempts are scored. OpenAI has limited Sol access to trusted partners and government institutions only, citing "High" risk classification for cybersecurity and biosafety domains.