PrismML launches 1.58-bit Ternary Bonsai models, cutting memory use to one-ninth while outscoring similar models
PrismML has released the Ternary Bonsai series, which uses 1.58-bit weights restricted to {-1, 0, +1} and requires only one-ninth the GPU memory of a 16-bit model. The 8B, 4B, and 1.7B sizes are open-sourced on Hugging Face under Apache 2.0 and run natively on Apple devices via the MLX framework. The 8B weights are approximately 1.75 GB, and the model posts a benchmark score of 75.5, leading among its peers. On the iPhone 17 Pro Max, the 8B model runs at 27 tokens/sec, with a 3–4 times improvement in energy efficiency.
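The "1.58-bit" label and the one-ninth memory figure can be checked with simple arithmetic. The sketch below is only an illustration: it assumes a nominal 8 billion parameters for the 8B model and decimal gigabytes (1 GB = 1e9 bytes), and the helper name `weight_size_gb` is hypothetical. Only the 1.75 GB figure comes from the announcement.

```python
import math

# A weight that takes one of three values {-1, 0, +1} carries log2(3) bits
# of information, which is where the "1.58-bit" label comes from.
BITS_PER_TERNARY_WEIGHT = math.log2(3)  # about 1.585


def weight_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


params_8b = 8e9        # assumption: nominal parameter count of the 8B model
reported_gb = 1.75     # weight size stated in the announcement

ideal_ternary_gb = weight_size_gb(params_8b, BITS_PER_TERNARY_WEIGHT)
fp16_gb = weight_size_gb(params_8b, 16)

# Effective bits per weight implied by the reported file size.
effective_bits = reported_gb * 1e9 * 8 / params_8b

print(f"bits per ternary weight : {BITS_PER_TERNARY_WEIGHT:.3f}")
print(f"ideal ternary size      : {ideal_ternary_gb:.3f} GB")
print(f"16-bit size             : {fp16_gb:.1f} GB")
print(f"16-bit / reported ratio : {fp16_gb / reported_gb:.2f}x")
print(f"effective bits per weight at {reported_gb} GB: {effective_bits:.2f}")
```

Under these assumptions the reported 1.75 GB works out to about 1.75 bits per weight, slightly above the theoretical 1.585. That gap would be consistent with storage overhead such as scaling factors or layers kept at higher precision, though the announcement does not say what accounts for it.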