Nemotron-Labs-Diffusion Collection A Tri-Mode Language Model Family Unifying Autoregressive, Diffusion, and Self-Speculation Decoding • 7 items • Updated about 3 hours ago • 47
Proven REAPs Collection Benchmarked REAP checkpoints with >=500 all-time downloads. GLM/Qwen/MiniMax/DeepSeek/Kimi/gemma. • 20 items • Updated 2 days ago • 10
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 19 items • Updated about 5 hours ago • 303
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 12 items • Updated 4 days ago • 151
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano and Super v3. • 32 items • Updated about 2 hours ago • 142
Article The Transformers Library: standardizing model definitions +2 lysandre, ArthurZ, pcuenq, julien-c • May 15, 2025 • 123
Article Welcome Falcon Mamba: The first strong attention-free 7B model +4 JingweiZuo, yellowvm, DhiyaEddine, IChahed, ybelkada, Gkunsch • Aug 12, 2024 • 113