According to Beating, Sapient Intelligence has open-sourced HRM-Text, a 1-billion-parameter text generation model based on its hierarchical reasoning model (HRM) architecture. Trained on just 40 billion structured tokens, the model needs only 46 hours on two 8-GPU H100 servers, at a compute cost of approximately $1,472 for the 1B version and $800 for the 0.6B variant; that is a 130- to 600-fold reduction in pretraining compute compared to standard models.
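The reported figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted above; the per-GPU-hour rate it derives is an implied value, not one stated by the source, and applying that same rate to the 0.6B variant is an assumption.

```python
# Figures quoted in the article for the 1B model.
servers = 2
gpus_per_server = 8
training_hours = 46
cost_1b_usd = 1472

# Total GPU-hours consumed by the 1B run.
gpu_hours_1b = servers * gpus_per_server * training_hours

# Implied hourly price per H100 (derived here, not stated in the source).
implied_rate_usd = cost_1b_usd / gpu_hours_1b

# Assumption: the 0.6B variant is billed at the same implied rate.
cost_06b_usd = 800
implied_gpu_hours_06b = cost_06b_usd / implied_rate_usd

print(f"1B run: {gpu_hours_1b} GPU-hours at ${implied_rate_usd:.2f}/GPU-hour")
print(f"0.6B run (same rate assumed): {implied_gpu_hours_06b:.0f} GPU-hours")
```

Under these assumptions the 1B run comes to 736 GPU-hours, which matches the quoted cost at $2 per GPU-hour.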
The efficiency gains come from a dual-timescale recurrent design: separate fast and slow Transformer modules alternate over the same input and exchange information via state addition. The complete engineering framework, including data extraction and PyTorch distributed training, has also been open-sourced. Note that the released weights are pretraining-only and unaligned; the model supports prefix completion but cannot function as a conversational assistant.
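A toy sketch may help picture the dual-timescale loop. It is not Sapient's implementation: `toy_block` is a hypothetical stand-in for a Transformer module, and the schedule (several fast steps per slow step, with `slow_cycles` and `fast_steps` as illustrative parameters) is an assumption. Only the idea of two modules sharing one input and merging states by addition comes from the description above.

```python
import math

def toy_block(vec, scale):
    """Hypothetical stand-in for a Transformer module (elementwise tanh)."""
    return [math.tanh(scale * v) for v in vec]

def add(*vecs):
    """Elementwise sum of equally sized vectors (the 'state addition')."""
    return [sum(vals) for vals in zip(*vecs)]

def dual_timescale_forward(x, slow_cycles=2, fast_steps=3):
    """Run the fast module several times per single slow-module update."""
    z_fast = [0.0] * len(x)
    z_slow = [0.0] * len(x)
    schedule = []
    for _ in range(slow_cycles):
        for _ in range(fast_steps):
            # Fast module: own state + slow state + input, merged by addition.
            z_fast = toy_block(add(z_fast, z_slow, x), scale=1.0)
            schedule.append("fast")
        # Slow module: own state + latest fast state + the same input.
        z_slow = toy_block(add(z_slow, z_fast, x), scale=0.5)
        schedule.append("slow")
    return z_slow, schedule

output, schedule = dual_timescale_forward([0.1, -0.2, 0.3])
print(schedule)
```

Because both modules are reused across iterations, effective depth grows with the number of loop steps rather than with parameter count, which is one plausible reading of where the compute savings originate.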