ClawBench: Browser Agent Benchmark Suite
Benchmark dataset (V1+V2), live leaderboard Space, and full V1 execution traces: everything you need to run, regrade, or compare on ClawBench.

NAIL-Group/ClawBench
ClawBench Leaderboard 🦀: Live leaderboard for the ClawBench web-agent benchmark
NAIL-Group/ClawBenchV1Trace
ClawBench: Can AI Agents Complete Everyday Online Tasks? (Paper 2604.08523, published Apr 9)