MiniMax @MiniMax_AI Responds at Length to "The Model Cannot Say Ma Jiaqi"


MiniMax's official WeChat account published a detailed response explaining why its M2-series models could not output the name "Ma Jiaqi," walking through the full troubleshooting process and the technical reasoning behind the "Jiaqi recognition" issue. ⬇️
MiniMax said it investigated along several dimensions: tokenizer version alignment, the statistical distribution of embeddings, semantic nearest-neighbor retrieval, few-shot experiments comparing the pre-training and post-training models, token frequency statistics over the post-training data, and a ranked scan of how much lm_head changed for each token across the full vocabulary.
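To make the last of these checks concrete, here is a minimal sketch of a ranked lm_head scan. The post does not say how MiniMax measured the change, so the per-row L2 norm, the function name `rank_lm_head_drift`, and the toy weights are all assumptions for illustration.

```python
import math

def rank_lm_head_drift(base_rows, tuned_rows, top_k=3):
    """Rank token ids by the L2 norm of the change in their lm_head row.

    base_rows / tuned_rows: one weight vector per token id, taken from the
    pre-training and post-training checkpoints (hypothetical inputs).
    """
    if len(base_rows) != len(tuned_rows):
        raise ValueError("vocabulary sizes differ; align tokenizer versions first")
    drift = []
    for token_id, (before, after) in enumerate(zip(base_rows, tuned_rows)):
        delta = math.sqrt(sum((a - b) ** 2 for b, a in zip(before, after)))
        drift.append((token_id, delta))
    # Largest change first, so unusual tokens surface at the top.
    drift.sort(key=lambda item: item[1], reverse=True)
    return drift[:top_k]

# Toy 4-token vocabulary with 2-dimensional rows (made-up numbers).
base = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
tuned = [[1.0, 0.1], [0.0, 1.0], [0.1, 0.2], [0.5, 0.3]]
ranked = rank_lm_head_drift(base, tuned)
print([token_id for token_id, _ in ranked])
```

In this toy data, token 2 moves the most and would be the first candidate to inspect. With a real checkpoint the rows would come from the model's output projection matrix rather than hand-written lists.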
The root cause they ultimately identified was: "'Jiaqi' was merged into a single token in the tokenizer, but this token appeared very infrequently in the post-training data, causing the model to gradually forget its ability to generate this token during post-training."
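A toy sketch of why a merged token can go unseen: once a string has its own vocabulary entry, training text that spells it out in smaller pieces does not count toward it. The greedy longest-match tokenizer, the Latin-letter vocabulary, and the three-line corpus below are hypothetical stand-ins, not MiniMax's tokenizer or data.

```python
from collections import Counter

def tokenize(text, vocab):
    """Greedy longest-match tokenization against a toy vocabulary."""
    max_len = max(len(piece) for piece in vocab)
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character if nothing longer matches.
            if piece in vocab or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens

def rare_tokens(corpus, vocab, min_count=1):
    """Return vocabulary entries seen fewer than min_count times in corpus."""
    counts = Counter()
    for line in corpus:
        counts.update(tokenize(line, vocab))
    return sorted(tok for tok in vocab if counts[tok] < min_count)

# "jiaqi" is a single merged token; "jia" and "qi" also exist as pieces.
vocab = {"jiaqi", "jia", "qi", "ma", "hello", " "}
post_training_corpus = ["hello ma", "ma jia qi", "hello"]
rare = rare_tokens(post_training_corpus, vocab)
print(rare)
```

Here the merged token never appears in the toy corpus even though its pieces do, which is the kind of gap a frequency scan over post-training data is meant to expose.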