Introducing Rain-v2: Democratizing LLM training on gaming GPUs! ⚡
Following Rain-100M, we’re scaling up. Rain-v2 features a larger training dataset.
We’ve published a blog post covering the end-to-end journey, from raw data collection to evaluation and safety testing.
HF Repo: 🤗 raincandy-u/Rain-v2
Blog: 📚 https://angelkawaii.xyz/2026/01/29/rain-v2/
Special thanks to the open-source community and the SmolLM2 team for their foundational work! 🚀
HuggingFaceTB
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model (2502.02737)