According to Beating, Kaiming He’s team at MIT recently released ELF (Embedded Language Flows), a language diffusion model that departs from the autoregressive “predict the next token” approach used by GPT-style models. Instead, ELF generates text in a continuous embedding space and converts to discrete tokens only in the final step.
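To make the idea of generating in a continuous embedding space and discretizing only at the end more concrete, here is a minimal toy sketch. It is not ELF's actual architecture or sampler: the velocity function is a hand-written stand-in for a learned network, nearest-embedding decoding is an assumed final step, and every name (`euler_flow_sample`, `decode_to_tokens`, `toy_velocity`, the table sizes) is hypothetical.

```python
import numpy as np


def euler_flow_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)
    return x


def decode_to_tokens(x, embedding_table):
    """Map each continuous vector to the id of its nearest embedding row.

    Assumption: nearest-neighbor decoding is used here for illustration only.
    """
    # Squared Euclidean distances, shape (seq_len, vocab_size).
    d2 = ((x[:, None, :] - embedding_table[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)


rng = np.random.default_rng(0)
vocab_size, dim, seq_len = 50, 8, 6
embedding_table = rng.normal(size=(vocab_size, dim))

# Hypothetical "clean" sequence the toy flow is steered toward.
target_ids = np.array([3, 14, 15, 9, 26, 5])
target = embedding_table[target_ids]


def toy_velocity(x, t):
    # Stand-in for a trained network: velocity of the straight-line path
    # that reaches `target` at t=1. A real model would predict this from x and t.
    return (target - x) / (1.0 - t)


x0 = rng.normal(size=(seq_len, dim))      # start from Gaussian noise
x1 = euler_flow_sample(toy_velocity, x0, num_steps=32)
token_ids = decode_to_tokens(x1, embedding_table)  # only discrete step
print(token_ids.tolist())
```

All 32 updates operate on real-valued vectors; token ids appear only in the last line, which is the contrast with autoregressive decoding that the article describes.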
On the OpenWebText unconditional generation benchmark, the 105M-parameter ELF-B achieved a generation perplexity (Gen. PPL) of approximately 24.1 with 32-step sampling, outperforming multiple discrete and continuous diffusion language model baselines. Notably, ELF-B required only about 45 billion training tokens, roughly one order of magnitude fewer than comparable methods, which typically exceed 500 billion tokens.
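For readers unfamiliar with the metric, the sketch below shows how a perplexity figure is generally computed and checks the token-budget comparison. Generation perplexity is usually obtained by scoring generated samples with a separate pretrained evaluator model; the article does not say which evaluator was used, and the log-probabilities below are made-up values chosen only to illustrate the formula.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative natural-log probability per token)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical evaluator scores: every token assigned probability 1/24.1.
# Real scores would vary per token; this just shows what "PPL of 24.1" means.
example_log_probs = [math.log(1 / 24.1)] * 10
ppl = perplexity(example_log_probs)

# Training-token comparison using the figures quoted in the article.
elf_tokens = 45e9
baseline_tokens = 500e9  # the article says baselines typically exceed this
token_ratio = baseline_tokens / elf_tokens

print(f"perplexity: {ppl:.1f}")
print(f"baseline / ELF-B training tokens: at least {token_ratio:.1f}x")
```

A ratio of roughly 11x is consistent with the article's "one order of magnitude" description.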