According to an April 28 announcement on the official NVIDIA blog (author Kari Briski), NVIDIA released Nemotron 3 Nano Omni, an open-source multimodal model that unifies vision, speech, and language capabilities in a single model, aiming to provide a “perception layer” for AI agent systems at lower latency and lower cost.
Key specifications: 30B-A3B MoE, 256K context, 9x throughput, first place on six benchmark leaderboards
Key architecture:
30B-A3B hybrid mixture-of-experts (total parameters 30B, activated 3B)
Integrates Conv3D and EVS encoding
256K context length
Inputs: text, images, audio, video, documents, charts, GUI screens
Outputs: text
Performance signals: 9x the throughput of other open-source omni models at a comparable level of interaction; first place on six benchmark leaderboards across three categories: document intelligence, video understanding, and audio understanding (NVIDIA’s announcement does not list specific scores and directs readers to the developer blog for details).
NVIDIA positions Nemotron 3 Nano Omni as the “eyes and ears” of agent systems. It can divide work with other models in the family, such as Nemotron 3 Super (high-frequency execution) and Nemotron 3 Ultra (complex planning), and it can also interoperate with third-party cloud models. Three typical agent application scenarios:
Computer-use agent: native visual reasoning at 1920×1080 resolution (see the sketch after this list)
Document intelligence: reasoning over mixed inputs spanning layouts, tables, screenshots, and other media
Audio/video understanding: integrates speech, scenes, and recordings into a single reasoning chain
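For the computer-use scenario, a minimal sketch of what sending a full-HD screenshot to such a model through an OpenAI-compatible endpoint could look like. The base URL, model ID (nvidia/nemotron-3-nano-omni), API key placeholder, screenshot path, and message schema are illustrative assumptions; NVIDIA’s announcement does not document the actual API.

# Hypothetical sketch: asking the model to reason over a 1920x1080 screenshot
# via an OpenAI-compatible chat endpoint. Model ID and endpoint are assumptions,
# not taken from the announcement; check build.nvidia.com for the real details.
import base64
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed hosted endpoint
    api_key="YOUR_NVIDIA_API_KEY",
)

# Encode a full-HD screenshot as a base64 data URL
with open("screenshot_1920x1080.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="nvidia/nemotron-3-nano-omni",  # hypothetical model ID
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Which button should the agent click to open Settings?"},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
    max_tokens=256,
)
print(response.choices[0].message.content)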
Adoption lineup: Hon Hai and Palantir among production users; H Company’s CEO gives a named statement
NVIDIA’s announcement clearly distinguishes “production adoption” from “under evaluation”:
Production adoption: Aible, Applied Scientific Intelligence (ASI), Eka Care, Hon Hai (Foxconn), H Company, Palantir, Pyler
Under evaluation: Amdocs, Dell, Docusign, Infosys, IQVIA, Lila, Oracle, Quantiphi, TCS, Zefr, etc.
H Company CEO Gautier Cloix is quoted by name in the announcement: “To build useful agents, you can’t wait seconds for a model to interpret a screen. By building on Nemotron 3 Nano Omni, our agents can rapidly interpret full HD screen recordings — something that wasn’t practical before.”
Open-source strategy and deployment: weights / datasets / training methods all made public
At the time of release, NVIDIA also published:
Model weights
Training datasets
Training techniques/methodology
Deployment pipeline covers three layers:
Local workstation: NVIDIA DGX Spark, DGX Station
NIM microservices: build.nvidia.com
Third-party platforms: Hugging Face, OpenRouter, and through 25+ NVIDIA Cloud Partners, inference platforms, and cloud service providers
Customization is handled with NVIDIA NeMo. Over the past year, the Nemotron 3 family (Nano/Super/Ultra) has accumulated more than 50 million downloads on Hugging Face; Omni now extends the family’s capabilities into the multimodal and agentic domains.
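As a starting point for local deployment, a minimal sketch of pulling the open weights from Hugging Face with huggingface_hub. The repo ID below is an assumption for illustration only; check NVIDIA’s Hugging Face organization for the actual name.

# Hypothetical sketch: downloading the released weights from Hugging Face.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="nvidia/Nemotron-3-Nano-Omni",  # hypothetical repo ID
    # token="hf_...",  # uncomment if the repo requires authentication
)
print(f"Weights downloaded to: {local_dir}")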