Tilde Research Discovers Muon Optimizer Kills 25% of Neurons; Aurora Alternative Achieves 100x Data Efficiency Gain

According to Tilde Research, the Muon optimizer, adopted by leading AI models including DeepSeek V4 and Kimi K2.5, has a hidden flaw: it causes more than 25% of MLP-layer neurons to die permanently early in training. The team designed an alternative optimizer, Aurora, and open-sourced it. By the team's account, a 1.1B-parameter model trained with Aurora on only 100B tokens matched Qwen3-1.7B, which was trained on 36T tokens, on language understanding benchmarks such as HellaSwag and Winogrande, a result the team presents as a roughly 100x improvement in data efficiency. Aurora adds 6% computational overhead compared to Muon and can serve as a direct replacement.
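
To make the "dead neuron" figure concrete, here is a minimal sketch of how such a fraction could be measured. It is not Tilde Research's method: the report does not say how a dead neuron is defined, so the code assumes a neuron is dead when its post-activation output stays at zero (within a tolerance `eps`) for every input in a sample. The function name `dead_neuron_fraction` and the toy activations are hypothetical.

```python
from typing import Sequence


def dead_neuron_fraction(
    activations: Sequence[Sequence[float]], eps: float = 0.0
) -> float:
    """Return the share of neurons that never activate on any input.

    activations: rows are inputs, columns are neurons (post-activation values).
    eps: assumed tolerance below which an output counts as inactive.
    """
    num_neurons = len(activations[0])
    dead = 0
    for j in range(num_neurons):
        # Assumption: a neuron is "dead" if it is inactive for every sampled input.
        if all(abs(row[j]) <= eps for row in activations):
            dead += 1
    return dead / num_neurons


# Toy data: 4 inputs, 4 neurons; the neuron at index 2 never fires.
acts = [
    [0.5, 0.0, 0.0, 1.2],
    [0.0, 0.3, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.7],
    [0.0, 0.9, 0.0, 0.0],
]
frac = dead_neuron_fraction(acts)
```

On real models the activations would come from an MLP layer evaluated over a held-out batch, and a value above 0.25 would correspond to the level the report attributes to Muon.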

