Article • EMO: Pretraining mixture of experts for emergent modularity • allenai • 8 days ago • 33
Article • Training Design for Text-to-Image Models: Lessons from Ablations • Photoroom • Feb 3 • 73
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding Paper • 2512.13586 • Published Dec 15, 2025 • 93
Article • Transformers v5: Simple model definitions powering the AI ecosystem • lysandre, ArthurZ, cyrilvallez, reach-vb +2 • Dec 1, 2025 • 311