Gate News message, April 24 — DeepSeek has released the V4 series of open-source models under the MIT License, with weights now available on Hugging Face and ModelScope. The series includes two mixture-of-experts (MoE) models: V4-Pro with 1.6 trillion total parameters and 49 billion activated per token, and V4-Flash with 284 billion total parameters and 13 billion activated per token. Both support a 1 million token context window.
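A quick back-of-the-envelope check shows how sparse these MoE models are; the parameter counts come from the announcement above, and the ratios are simple arithmetic:

```python
# Fraction of parameters activated per token for each MoE model,
# using the totals reported in the release announcement.
V4_PRO_TOTAL, V4_PRO_ACTIVE = 1.6e12, 49e9     # 1.6T total, 49B active
V4_FLASH_TOTAL, V4_FLASH_ACTIVE = 284e9, 13e9  # 284B total, 13B active

pro_ratio = V4_PRO_ACTIVE / V4_PRO_TOTAL       # share of weights used per token
flash_ratio = V4_FLASH_ACTIVE / V4_FLASH_TOTAL

print(f"V4-Pro activates {pro_ratio:.1%} of its parameters per token")
print(f"V4-Flash activates {flash_ratio:.1%} of its parameters per token")
```

Roughly 3% of V4-Pro's weights and under 5% of V4-Flash's weights are touched for any given token, which is what keeps per-token compute far below what the total parameter counts suggest.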
The architecture features three key upgrades: a hybrid attention mechanism combining compressed sparse attention (CSA) and heavily compressed attention (HCA), which sharply reduces long-context overhead (V4-Pro's inference FLOPs at 1M context are just 27% of V3.2's, and its KV cache, the VRAM used to store attention keys and values for previously processed tokens, is only 10% of V3.2's); manifold-constrained hyperconnections (mHC), which replace traditional residual connections to stabilize cross-layer signal propagation; and the Muon optimizer for faster training convergence. Pre-training used over 32 trillion tokens of data.
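To see why a KV cache at 10% of baseline matters at 1M context, here is a rough sizing sketch. The layer, head, and dimension values below are hypothetical placeholders (DeepSeek has not published V4's internal dimensions in this article); only the 10% factor comes from the announcement:

```python
# Rough KV-cache sizing for a dense-attention baseline vs. a cache
# compressed to 10% of baseline. All model dimensions are assumed.
layers, kv_heads, head_dim = 60, 8, 128  # placeholders, not published figures
seq_len = 1_000_000                      # 1M-token context
bytes_per_elem = 1                       # FP8 storage: 1 byte per element

# Keys + values: two tensors per layer of shape (kv_heads, seq_len, head_dim).
baseline_bytes = 2 * layers * kv_heads * seq_len * head_dim * bytes_per_elem
compressed_bytes = 0.10 * baseline_bytes  # "only 10% of V3.2's"

gib = 1024 ** 3
print(f"baseline KV cache:   {baseline_bytes / gib:.1f} GiB")
print(f"compressed KV cache: {compressed_bytes / gib:.1f} GiB")
```

Under these assumed dimensions the baseline cache alone runs to well over 100 GiB at 1M tokens, so a 10x reduction is the difference between needing a multi-GPU node just for the cache and fitting it on a single accelerator.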
Post-training employs a two-stage approach: first, domain-specific experts are trained via supervised fine-tuning (SFT) and GRPO reinforcement learning; then they are merged into a single model through online distillation. V4-Pro-Max (the highest inference-effort mode) is claimed to be the strongest open-source model, with top-tier coding benchmark scores and significantly narrowed gaps to closed-source frontier models on reasoning and agent tasks. V4-Flash-Max matches Pro-level reasoning performance given a sufficient compute budget, but is limited by its parameter scale on pure-knowledge and complex agent tasks. Weights are stored in mixed FP4+FP8 precision.
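The merging step can be pictured as ordinary distillation: the merged student is trained to match the next-token distributions produced by the domain experts. A minimal, framework-free sketch of the KL objective follows; it illustrates the generic technique only, not DeepSeek's actual training code, and the logits are invented:

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q): how far the student distribution q is from the teacher p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token logits over a tiny three-word vocabulary.
teacher_logits = [2.0, 0.5, -1.0]  # a domain expert (e.g. a coding expert)
student_logits = [1.8, 0.6, -0.9]  # the merged model being distilled

teacher_p = softmax(teacher_logits)
student_q = softmax(student_logits)
loss = kl_divergence(teacher_p, student_q)  # driven toward 0 during training
print(f"distillation loss: {loss:.4f}")
```

"Online" distillation means the teacher outputs are generated on the fly during student training rather than precomputed; the loss itself is the same divergence shown here.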
Related Articles
NeoSoul Co-Founder Kaelan: AI Industry Should Allow Toys to Exist, Innovation Often Starts as Experimental Products
Gate News message, April 24 — At a recent Hong Kong forum on intelligent encrypted finance, NeoSoul co-founder Kaelan shared insights on evaluating AI projects in the early-stage, rapidly evolving AI industry. Beyond assessing current products, teams must demonstrate the ability to keep pace with un
GateNews · 23m ago
Meta and Amazon Agree on Multi-Billion Dollar Deal to Supply Graviton Chips for AI Development
Gate News message, April 24 — Meta Platforms and Amazon Web Services (AWS) have reached a multi-billion dollar agreement to support Meta's artificial intelligence initiatives over the coming years, according to the Wall Street Journal. Under the deal, Meta will use tens of millions of AWS Graviton c
GateNews · 35m ago
DeepSeek V4-Flash goes live on Ollama Cloud, US-hosted: Claude Code, OpenClaw one-click integration
Ollama Cloud has launched DeepSeek V4-Flash, with inference hosted on U.S. servers and three sets of one-click commands for connecting Claude Code, OpenClaw, and Hermes. V4-Flash/V4-Pro use an MoE architecture with native 1M-context support, cutting costs through token-wise compression plus DSA sparse attention. At 1M context, FLOPs per token drop to 27% of V3.2's and the KV cache to 10%. The API is compatible with OpenAI ChatCompletions and Anthropic, making it easy to switch between multiple workflows while lowering costs and data-sovereignty risk.
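Compatibility with the OpenAI ChatCompletions schema means a request is just the familiar JSON body, regardless of which host serves it. A sketch of constructing such a body follows; the model identifier "deepseek-v4-flash" is an assumption here, and the actual endpoint URL and model name should be taken from Ollama Cloud's documentation:

```python
import json

def chat_request(model, user_message, max_tokens=256):
    """Build an OpenAI-ChatCompletions-style request body (schema sketch)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": max_tokens,
    }

# Model name below is a placeholder, not a confirmed identifier.
body = chat_request("deepseek-v4-flash",
                    "Summarize sparse attention in one line.")
print(json.dumps(body, indent=2))
```

Because the body shape is identical across OpenAI-compatible providers, switching workflows typically amounts to changing the base URL, API key, and model string rather than rewriting client code.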
ChainNews · Abmedia · 2h ago
Web3 AI Infrastructure AIW3 Raises $2M in Seed Funding Led by Buffalo Capital
Gate News message, April 24 — Web3 AI infrastructure platform AIW3 announced the completion of a $2 million seed round funding. The round was led by Buffalo Capital, with GalaXin Capital and Three-stones Ventures participating as co-investors.
AIW3 is transitioning toward an Agent-as-a-Service
GateNews · 2h ago
Cohere Acquires German AI Firm Aleph Alpha, Secures $600M Investment for European Expansion
Gate News message, April 24 — Canadian AI company Cohere announced plans to acquire German AI firm Aleph Alpha to strengthen its presence in Europe. Schwarz Group, a backer of Aleph Alpha, plans to invest $600 million in Cohere's Series E funding round.
The funding round is expected to close in 202
GateNews · 3h ago
Xpeng, Xiaomi Lead In-Car AI Push at Beijing Auto Show
Gate News message, April 24 — Chinese automakers showcased advanced in-car AI systems at the Beijing Auto Show on April 24, as the country accelerates its AI Plus strategy and seeks greater independence from foreign semiconductors.
Xpeng demonstrated voice-controlled parking that allows drivers to
GateNews · 3h ago