TIGER-Lab/MMLU-Pro
Benchmark
Natural Language Processing, Image Generation
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time