Xiaomi’s AI model lead: As AI competition shifts to the Agent era, self-evolution is a key event on the path to AGI


Xiaomi’s large-model team lead, Luo Fuli, gave an in-depth interview on the Bilibili platform on April 24 (video ID: BV1iVoVBgERD). The interview ran 3.5 hours and marked the first time she has publicly and systematically laid out her technical views as a technical leader. Luo Fuli said that the large-model competition has shifted from the Chat era to the Agent era, and pointed out that “self-evolution” will be the key event for AGI in the coming year.

From the Chat Era to the Agent Era: Core Technical Judgment

Xiaomi large-model team lead Luo Fuli interview

(Source: Bilibili)

In the Bilibili interview, Luo Fuli said that the focus of the 2026 large-model competition has shifted from general dialogue quality to sustained autonomous execution of complex tasks. Current top models, she said, can autonomously optimize on specific tasks and run steadily for 2 to 3 days without human intervention or adjustment. She emphasized that the breakthrough in “self-evolution” capability means AI systems are beginning to self-correct, and singled out Anthropic’s technical roadmap and variables such as Claude Opus 4.6 as factors that will affect the entire AI ecosystem.

Xiaomi’s Compute Allocation Adjustment and Pre-train Generation Gap Assessment

According to what Luo Fuli disclosed in the interview, Xiaomi has already made major adjustments to its compute allocation strategy. She explained that the compute allocation ratio commonly used in the industry is Pre-train:Post-train:Inference = 3:5:1, while Xiaomi’s current strategy has been adjusted to 3:1:1, significantly compressing the proportion allocated to post-training while simultaneously increasing resource investment in the inference stage.

In the interview, she explained that this shift is driven by the maturing of the Agent RL Scaling strategy: post-training no longer requires piling up large amounts of compute, while the increased resources on the inference side reflect the real-time responsiveness that deployed Agent scenarios demand.
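The ratio shift described above can be made concrete with a small arithmetic sketch. The ratios come from the interview; the 100-unit budget is an illustrative assumption, not a figure Xiaomi disclosed:

```python
def split_compute(total: float, ratio: dict[str, float]) -> dict[str, float]:
    """Split a compute budget proportionally to the given stage ratios."""
    s = sum(ratio.values())
    return {stage: total * r / s for stage, r in ratio.items()}

# Industry-standard Pre-train:Post-train:Inference = 3:5:1
industry = split_compute(100.0, {"pre_train": 3, "post_train": 5, "inference": 1})
# Xiaomi's adjusted 3:1:1
xiaomi = split_compute(100.0, {"pre_train": 3, "post_train": 1, "inference": 1})

# Industry 3:5:1 -> pre-train ~33.3%, post-train ~55.6%, inference ~11.1%
# Xiaomi   3:1:1 -> pre-train 60%,    post-train 20%,    inference 20%
```

Note that under a fixed budget, moving from 3:5:1 to 3:1:1 nearly doubles the inference share (from about 11% to 20%) even though the inference term itself stays at 1, which is consistent with the stated aim of shifting resources toward the inference stage.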

Regarding the pre-train generation gap for domestic large models, Luo Fuli said in the interview that this gap has narrowed from roughly three years to a few months, and that the current strategic focus is shifting toward Agent RL Scaling. Luo Fuli’s career history includes Alibaba’s DAMO Academy, High-Flyer Quant, and DeepSeek (where she was a core developer of DeepSeek-V2). She joined Xiaomi in November 2025.

MiMo-V2 Series Technical Specifications and Open-Source Rankings

According to a MiMo-V2 series announcement released by Xiaomi on March 19, 2026, three models were released at once:

MiMo-V2-Pro: trillion-scale total parameters; 42B activated parameters; hybrid attention architecture; million-level context support; 81% task completion rate

MiMo-V2-Omni: Multi-modal Agent scenarios

MiMo-V2-TTS: Speech synthesis scenarios

According to the announcement, the open-sourced MiMo-V2-Flash ranked second on the global open-source model leaderboard, with an inference speed reaching 3x that of DeepSeek-V3.2.

Frequently Asked Questions

How does Luo Fuli define “self-evolution,” and why does she think it is the most critical event for AGI?

In the April 24, 2026 Bilibili interview (BV1iVoVBgERD), Luo Fuli said that top models can now autonomously optimize on specific tasks and run steadily for 2 to 3 days without human intervention, and she characterizes this “self-evolution” as the most critical event for AGI development in the coming year.

What specific adjustments has Xiaomi made to its compute allocation ratios, and what is the rationale behind it?

Based on Luo Fuli’s disclosures in the interview, Xiaomi’s compute allocation ratio has been adjusted from the industry-standard Pre-train:Post-train:Inference = 3:5:1 to 3:1:1, significantly compressing the proportion for post-training. She attributed the adjustment to improved post-training efficiency as the Agent RL Scaling strategy matures, as well as the real-time responsiveness that deployed Agent scenarios demand on the inference side.

What is the open-source ranking and speed performance of MiMo-V2-Flash?

According to Xiaomi’s official announcement released on March 19, 2026, the open-sourced MiMo-V2-Flash ranks second on the global open-source model leaderboard, with inference speed 3 times that of DeepSeek-V3.2. The task completion rate of the flagship MiMo-V2-Pro is 81%.

