Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory Paper • 2606.06523 • Published 7 days ago • 1
Future-KL Regularized GRPO: Process-Level Credit Assignment from $f$-Divergence Regularization Paper • 2601.10201 • Published 17 days ago • 10
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models Paper • 2310.13671 • Published Oct 20, 2023 • 19