---
license: apache-2.0
base_model:
- seeklhy/OmniSQL-32B
---

## Important Links

- [Paper (arXiv)](https://arxiv.org/abs/2509.24403)
- [Code (GitHub)](https://github.com/antgroup/Agentar-Scale-SQL)
- [BIRD benchmark leaderboard](https://bird-bench.github.io/)
- [Hugging Face collection](https://huggingface.co/collections/antgroup/agentar-scale-sql)
- [ModelScope collection](https://modelscope.cn/collections/Agentar-Scale-SQL-0c368e98f73f41)

## Introduction

We are excited to release **Agentar-Scale-SQL-Generation-32B**, the core **Reasoning SQL Generator** used in our SOTA framework, **Agentar-Scale-SQL**. The framework achieved **81.67% execution accuracy** on the test set of the challenging BIRD benchmark, ranking first on the official leaderboard.

This model is a central component of our "Orchestrated Test-Time Scaling" strategy and has the following features:

- **Base Model:** It is fine-tuned from `OmniSQL-32B`.
- **RL-Enhanced Reasoning:** The model was further trained using an execution-grounded **Reinforcement Learning** framework (GRPO) to enhance its intrinsic reasoning capabilities.
- **Deep Reasoning:** It is engineered to conduct deep, step-by-step reasoning and construct complex, high-accuracy SQL queries.

This model is one of the two main generators in the "Diverse Synthesis" step of the `Agentar-Scale-SQL` framework, working in parallel with an in-context learning (ICL) generator to produce a robust pool of SQL candidates.

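
To make the idea of a candidate pool concrete, here is a minimal sketch of how candidates from two generators could be merged and grouped by their execution results. It is an illustration only and is not taken from the Agentar-Scale-SQL code: the function names `run_query` and `build_candidate_pool`, the toy `singer` table, and the choice of SQLite are all assumptions.

```python
import sqlite3
from collections import defaultdict


def run_query(conn, sql):
    """Return the result rows as an order-insensitive frozenset, or None on error."""
    try:
        return frozenset(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def build_candidate_pool(conn, reasoning_sqls, icl_sqls):
    """Merge candidates from both generators and group them by execution result.

    Candidates that fail to execute are dropped. Each group holds
    (source, sql) pairs whose queries returned the same set of rows.
    """
    groups = defaultdict(list)
    for source, sqls in (("reasoning", reasoning_sqls), ("icl", icl_sqls)):
        for sql in sqls:
            result = run_query(conn, sql)
            if result is not None:
                groups[result].append((source, sql))
    return list(groups.values())


# Toy database (hypothetical schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Bo', 24), (3, 'Cy', 45);
    """
)

reasoning_candidates = [
    "SELECT name FROM singer WHERE age > 30",
    "SELECT name FROM singr",  # invalid table name, dropped from the pool
]
icl_candidates = [
    "SELECT name FROM singer WHERE age >= 31",
    "SELECT name, age FROM singer WHERE age > 30",
]

pool = build_candidate_pool(conn, reasoning_candidates, icl_candidates)
for group in pool:
    print(len(group), [sql for _, sql in group])
```

Grouping by result rather than by SQL text treats syntactically different but equivalent queries as one candidate, which keeps the pool small before any selection step.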
## Model Downloads

| **Model** | **Role** |
|-----------------------------------|----------------|
| **Agentar-Scale-SQL-Generation-32B** | **SQL Generator** |
| Agentar-Scale-SQL-Selection-32B | SQL Selector |

## Performance

The performance metrics below reflect the **entire Agentar-Scale-SQL framework**, which uses this Generation model as a key component. On the BIRD test set, the framework achieves the highest execution accuracy (EX) and R-VES among the methods listed.

| Methods | EX (Dev, %) | **EX (Test, %)** | R-VES (%) |
|:-----------------------------|:---:|:---:|:---------:|
| **Agentar-Scale-SQL (Ours)** | **74.90** | **81.67** | **77.00** |
| AskData + GPT-4o | 76.14 | 80.88 | 76.24 |
| LongData-SQL | 74.32 | 77.53 | 71.89 |
| CHASE-SQL + Gemini | 74.90 | 76.02 | 69.94 |
| JoyDataAgent-SQL | 74.25 | 75.74 | 70.16 |
| TCDataAgent-SQL | 74.12 | 75.74 | - |
| Contextual-SQL | 73.50 | 75.63 | 70.02 |
| XiYan-SQL | 73.34 | 75.63 | 71.41 |

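
For readers unfamiliar with the EX metric, the sketch below shows how execution accuracy is commonly computed: a prediction counts as correct when it returns the same set of rows as the gold query. This is a simplified illustration, not the official BIRD evaluation script; the helper names `execution_match` and `execution_accuracy` and the toy `singer` table are assumptions.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same set of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a query that fails to execute counts as wrong
    gold_rows = conn.execute(gold_sql).fetchall()
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn, pairs):
    """Percentage of (predicted, gold) pairs whose results match."""
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return 100.0 * hits / len(pairs)


# Toy database (hypothetical schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Bo', 24), (3, 'Cy', 45);
    """
)

pairs = [
    # Equivalent filter, same rows: correct.
    ("SELECT name FROM singer WHERE age >= 31", "SELECT name FROM singer WHERE age > 30"),
    # Same count under a different alias: correct.
    ("SELECT COUNT(*) AS n FROM singer", "SELECT COUNT(id) FROM singer"),
    # Extra column in the SELECT clause: wrong.
    ("SELECT name, age FROM singer", "SELECT name FROM singer"),
    # Invalid table name: wrong.
    ("SELECT name FROM singr", "SELECT name FROM singer"),
]

ex = execution_accuracy(conn, pairs)
print(f"EX = {ex:.2f}%")
```

The third pair shows why the prompt instructions insist on returning only the requested columns: an otherwise reasonable query with an extra column fails the row comparison.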
## Prompt Template

The prompt template below is written in Jinja2 syntax. The `matched_contents` and `hint` blocks are rendered only when those variables are set.

````python
PROMPT_TEMPLATE = """Task Overview:
You are a data science expert. Below, you are provided with a database schema and a natural language question. Your task is to understand the schema and generate a valid SQL query to answer the question.

Database Engine:
{{ dialect }}

Database Schema:
{{ db_schemas }}
This schema describes the database's structure, including tables, columns, primary keys, foreign keys, and any relevant relationships or constraints.
{% if matched_contents %}
Matched contents:
{{ matched_contents }}
Matched contents presents values related to the question, together with their source table and column, for your reference in SQL generation.
{% endif %}
Question:
{%- if hint %}
{{ hint }}
{{ question }}
{%- else %}
{{ question }}
{%- endif %}

Instructions:
- If Matched contents is provided, you can use it as reference when generating the SQL query.
- Make sure you only output the information that is asked in the question. If the question asks for a specific column, make sure to only include that column in the SELECT clause, nothing more.
- The generated query should return all of the information asked in the question without any missing or extra information.
- Before generating the final SQL query, please think through the steps of how to write the query.

Output Format:
In your answer, please enclose the generated SQL query in a code block:
```sql
-- Your SQL query
```

Take a deep breath and think step by step to find the correct SQL query.
"""
````

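
Because the template asks the model to reason step by step and then enclose its answer in a `sql` code block, the caller has to pull the final query out of the response. The sketch below shows one way to do that with the standard library. The helper name `extract_sql` and the choice to keep the last `sql` block (treating earlier ones as drafts) are assumptions, not part of the released code.

```python
import re

# Built programmatically so this example nests cleanly inside Markdown.
FENCE = "`" * 3

# Matches a fenced block tagged "sql" and captures its body.
SQL_BLOCK = re.compile(FENCE + r"sql\s*\n(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def extract_sql(response):
    """Return the body of the last sql code block in the response, or None."""
    blocks = SQL_BLOCK.findall(response)
    if not blocks:
        return None
    return blocks[-1].strip()


# A made-up response with a draft query followed by the final one.
response = "\n".join(
    [
        "First, list every singer.",
        FENCE + "sql",
        "SELECT name FROM singer;",
        FENCE,
        "That draft misses the age filter, so the final query is:",
        FENCE + "sql",
        "SELECT name FROM singer WHERE age > 30;",
        FENCE,
    ]
)

final_sql = extract_sql(response)
print(final_sql)
```

Returning `None` when no block is found lets the caller decide whether to retry generation or discard the candidate.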
## Acknowledgments

If you find our work useful, please cite the Agentar-Scale-SQL paper:

```bibtex
@misc{wang2025agentarscalesqladvancingtexttosqlorchestrated,
  title={Agentar-Scale-SQL: Advancing Text-to-SQL through Orchestrated Test-Time Scaling},
  author={Pengfei Wang and Baolin Sun and Xuemei Dong and Yaxun Dai and Hongwei Yuan and Mengdie Chu and Yingqi Gao and Xiang Qi and Peng Zhang and Ying Yan},
  year={2025},
  eprint={2509.24403},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2509.24403},
}
```