Google DeepMind released AI co-mathematician, a multi-agent mathematics research assistant that achieved 47.9% accuracy on the FrontierMath Tier 4 benchmark, surpassing GPT-5.5 Pro's previous record of 39.6% on May 9. The system solved 23 of the 48 problems, including 3 that no previous model had solved. Built on Gemini 3.1 Pro, it uses a hierarchical design: a project coordinator agent distributes tasks to sub-agents that handle literature retrieval, coding, and reasoning, and multiple reviewer agents validate proofs before submission.
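The coordinator, sub-agent, and reviewer roles described above can be illustrated with a minimal Python sketch. This is not DeepMind's implementation: every name here (`Task`, `Coordinator`, `literature_agent`, `coding_agent`, `reasoning_agent`, `covers_all_stages`) is hypothetical, the sub-agents are stubs that return tagged strings where a real system would presumably call a model, and the rule that every reviewer must approve before submission is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Task:
    kind: str    # "literature", "coding", or "reasoning"
    prompt: str


# Hypothetical stand-ins for sub-agents; a real system would query a model here.
def literature_agent(prompt: str) -> str:
    return f"[literature] references for: {prompt}"


def coding_agent(prompt: str) -> str:
    return f"[coding] experiment for: {prompt}"


def reasoning_agent(prompt: str) -> str:
    return f"[reasoning] proof step for: {prompt}"


# A reviewer takes a draft and returns True to approve it.
Reviewer = Callable[[str], bool]


def non_empty(draft: str) -> bool:
    return bool(draft.strip())


def covers_all_stages(draft: str) -> bool:
    return all(tag in draft for tag in ("[literature]", "[coding]", "[reasoning]"))


@dataclass
class Coordinator:
    sub_agents: Dict[str, Callable[[str], str]]
    reviewers: List[Reviewer]

    def plan(self, problem: str) -> List[Task]:
        # Assumption: one task per sub-agent kind, all sharing the same prompt.
        return [Task(kind, problem) for kind in self.sub_agents]

    def solve(self, problem: str) -> Optional[str]:
        # Dispatch each task to the matching sub-agent and merge the results.
        results = [self.sub_agents[t.kind](t.prompt) for t in self.plan(problem)]
        draft = "\n".join(results)
        # Assumption: submit only if every reviewer approves the draft.
        if all(review(draft) for review in self.reviewers):
            return draft
        return None


SUB_AGENTS = {
    "literature": literature_agent,
    "coding": coding_agent,
    "reasoning": reasoning_agent,
}

coordinator = Coordinator(SUB_AGENTS, [non_empty, covers_all_stages])
answer = coordinator.solve("toy conjecture")
```

Here `answer` holds the merged draft only when both reviewers approve; a coordinator with any rejecting reviewer returns `None` instead, which mirrors the idea of validating a proof before it is submitted.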
Epoch AI ran the evaluation blind, so the DeepMind team could not see the problems, and each problem was allowed 48 hours of computation. In a real-world application, mathematician Marc Lackenby used the system to resolve an open conjecture from the Kourovka Notebook, demonstrating its practical research value. The system is currently in beta testing and available to a limited number of mathematicians.