The ToolRL model trained for tool use through GRPO
Cheng Qian
chengq9
AI & ML interests
Agent, Tool Learning
Recent Activity
upvoted a collection 17 days ago
AgentDoG
upvoted a paper 2 months ago
JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
upvoted a paper 4 months ago
Multimodal Policy Internalization for Conversational Agents