Google DeepMind AI Co-Mathematician Hits 47.9% on FrontierMath Tier 4, Beats GPT-5.5 Pro, Solves 3 Previously Unsolved Problems

Google DeepMind released AI co-mathematician, a multi-agent math research assistant that achieved 47.9% accuracy on the FrontierMath Tier 4 benchmark, surpassing GPT-5.5 Pro's previous record of 39.6% on May 9. The system solved 23 of 48 problems (23/48 ≈ 47.9%), including 3 that no previous model had solved. It is built on Gemini 3.1 Pro and uses a hierarchical design: a project coordinator agent distributes tasks to sub-agents that handle literature retrieval, coding, and reasoning, and multiple reviewer agents validate proofs before submission.
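To make the coordinator, sub-agent, and reviewer roles concrete, here is a minimal toy sketch in Python. It is not DeepMind's implementation: every name in it (`Draft`, `coordinate`, `stub_agent`, the reviewer functions) is hypothetical, the agents are plain functions standing in for model calls, and the retry loop controlled by `max_rounds` is an assumption not described in the report.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Draft:
    """A candidate answer assembled from sub-agent contributions (hypothetical type)."""
    problem: str
    notes: Dict[str, str]  # sub-agent role -> its contribution


SubAgent = Callable[[str], str]
Reviewer = Callable[[Draft], bool]


def coordinate(
    problem: str,
    sub_agents: Dict[str, SubAgent],
    reviewers: List[Reviewer],
    max_rounds: int = 3,  # assumption: the report does not mention retries
) -> Optional[Draft]:
    """Dispatch the problem to every sub-agent, then submit only if all reviewers approve."""
    for _ in range(max_rounds):
        notes = {role: agent(problem) for role, agent in sub_agents.items()}
        draft = Draft(problem, notes)
        if all(review(draft) for review in reviewers):
            return draft
    return None  # no draft passed review, so nothing is submitted


def stub_agent(role: str) -> SubAgent:
    """Stand-in for a real model call; returns a canned string."""
    return lambda problem: f"{role} notes for: {problem}"


# The three roles named in the report, backed by stubs.
SUB_AGENTS = {role: stub_agent(role) for role in ("literature", "coding", "reasoning")}


def has_all_sections(draft: Draft) -> bool:
    """Toy reviewer: every role must have contributed."""
    return set(draft.notes) == set(SUB_AGENTS)


def nonempty(draft: Draft) -> bool:
    """Toy reviewer: no contribution may be empty."""
    return all(draft.notes.values())


result = coordinate("toy problem", SUB_AGENTS, [has_all_sections, nonempty])
```

The point of the sketch is the gate at the end: a draft is returned only when every reviewer approves, which mirrors the "validate before submission" step described above.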

Epoch AI ran the evaluation blind: the DeepMind team could not see the problems, and each problem was allowed 48 hours of computation. In a real-world application, mathematician Marc Lackenby used the system to resolve an open conjecture from the Kourovka Notebook, demonstrating its practical research value. The system is currently available to a limited number of mathematicians in beta testing.
