According to an investor relations disclosure on May 12, Yuntianliyifei's inference chip under development adopts a GPNPU architecture as its core technology roadmap. Key technical highlights include GPGPU-class general-purpose programmability compatible with the mainstream CUDA ecosystem, NPU cores optimized for inference efficiency, and a 3D stacked memory architecture designed to raise bandwidth and cut access latency, breaking through the "memory wall" bottleneck.
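The "memory wall" claim can be illustrated with back-of-envelope decode arithmetic: in autoregressive inference, generating each token streams the model weights from memory, so per-token latency is often bandwidth-bound rather than compute-bound, and raising memory bandwidth (as 3D stacking aims to do) directly lowers that bound. A minimal sketch; the model size and bandwidth figures below are illustrative assumptions, not Yuntianliyifei specifications.

```python
# Illustrative memory-wall arithmetic for decode-phase LLM inference.
# All numbers are hypothetical examples, not vendor specifications.

def decode_token_latency_s(model_bytes: float, mem_bandwidth_bps: float) -> float:
    """Lower bound on per-token latency when decoding is bandwidth-bound:
    each generated token streams the full weight set from memory."""
    return model_bytes / mem_bandwidth_bps

# A 70B-parameter model stored in 8-bit weights: ~70 GB of weights.
model_bytes = 70e9
# Conventional memory bandwidth vs. a 3D-stacked (HBM-class) memory.
conventional_bw = 1.0e12   # 1 TB/s
stacked_bw = 4.0e12        # 4 TB/s

t_conventional = decode_token_latency_s(model_bytes, conventional_bw)  # 0.07 s/token
t_stacked = decode_token_latency_s(model_bytes, stacked_bw)            # 0.0175 s/token
print(f"{t_conventional*1000:.1f} ms/token vs {t_stacked*1000:.2f} ms/token")
```

Under these assumptions, quadrupling bandwidth quadruples the bandwidth-bound token rate, which is why stacked memory is pitched as a latency and cost lever for inference.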
The company also employs a modular compute architecture to support rack-level scale-up supernodes for inference of trillion- and hundred-trillion-parameter MoE models. The technology roadmap aims to reduce token costs exponentially and accelerate the deployment of large-model applications.
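The link between MoE models and token cost can be made concrete: with top-k routing, only the routed experts' weights are active for each token, so per-token compute and bandwidth scale with active rather than total parameters. A minimal sketch with hypothetical expert counts and sizes (not figures from the disclosure):

```python
# Illustrative active-parameter arithmetic for a top-k routed MoE model.
# Expert counts and parameter sizes are hypothetical, not vendor figures.

def moe_active_params(top_k: int, expert_params: float,
                      shared_params: float) -> float:
    """Parameters touched per token: shared layers (attention, embeddings)
    plus the k experts selected by the router."""
    return shared_params + top_k * expert_params

# Hypothetical ~1-trillion-parameter MoE: 256 experts, top-8 routing.
num_experts = 256
expert_params = 3.7e9
shared_params = 50e9
total = shared_params + num_experts * expert_params   # ~1.0e12 total
active = moe_active_params(8, expert_params, shared_params)  # ~7.96e10 active
print(f"total ≈ {total:.2e}, active ≈ {active:.2e}, ratio ≈ {active/total:.1%}")
```

In this toy configuration only about 8% of the parameters are touched per token, which is the mechanism by which trillion-scale MoE inference stays economical on a scale-up supernode.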