---
license: other
library_name: transformers
tags:
- reasoning
- extrapolation
- synthetic-data
- transformers
---
# Interplay-LM Extrapolation RL Models

This repository is organized by experiment setting. Each top-level directory corresponds to one pretraining mixture used in the extrapolation experiments.

Within each setting:

- `base/` stores the base model used to initialize RL.
- `rl/` stores the final RL checkpoints for each experiment variant.

Only inference-relevant Hugging Face files are included.
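
The layout above implies a predictable path for every checkpoint. As a minimal sketch, the hypothetical helper below (`checkpoint_subfolder` is not part of this repository) builds a `subfolder` value from a setting name and an optional RL variant. It assumes the base model files sit directly under `base/` and that each RL variant has its own directory under `rl/`; check the file listing before relying on either assumption.

```python
from typing import Optional


def checkpoint_subfolder(setting: str, variant: Optional[str] = None) -> str:
    """Return the repo-relative directory of a checkpoint.

    Assumption (not confirmed by the model card): base files live directly
    in `<setting>/base`, and RL variants live in `<setting>/rl/<variant>`.
    """
    if variant is None:
        # No variant requested: point at the base model used to initialize RL.
        return f"{setting}/base"
    return f"{setting}/rl/{variant}"


print(checkpoint_subfolder("id2-10_0.5easy_0.3medium_0.2hard", "op11-14_uniform"))
# id2-10_0.5easy_0.3medium_0.2hard/rl/op11-14_uniform
```

The returned string can be passed as the `subfolder` argument of `from_pretrained`.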
## Included settings

- `id2-10_0.2easy_0.3medium_0.5hard`
- `id2-10_0.5easy_0.3medium_0.2hard`
- `id2-10_0.4995easy_0.4995medium_0.001hard`
- `id2-10_0.475easy_0.475medium_0.05hard`
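
The setting names appear to encode the difficulty mixture, with each `<weight><level>` pair giving the share of easy, medium, or hard data. That reading is an assumption drawn from the names alone. The meaning of the `id2-10` prefix is not stated here, so the sketch below keeps it as an opaque string. `parse_setting` is a hypothetical helper, not something shipped with the models.

```python
import re
from typing import Dict, Tuple

# Matches pairs such as "0.4995easy" or "0.05hard".
_PAIR = re.compile(r"(\d+(?:\.\d+)?)(easy|medium|hard)")


def parse_setting(name: str) -> Tuple[str, Dict[str, float]]:
    """Split a setting name into its prefix and per-level weights.

    Assumption: the numbers are mixture proportions for each difficulty level.
    """
    prefix, _, rest = name.partition("_")
    weights = {level: float(value) for value, level in _PAIR.findall(rest)}
    return prefix, weights


prefix, weights = parse_setting("id2-10_0.4995easy_0.4995medium_0.001hard")
print(prefix, weights)
```

Under this reading, the weights of all four listed settings sum to 1, which is a quick sanity check on a parsed name.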
## Load

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Interplay-LM-Reasoning/extrapolation_rl"
# Checkpoints live in subdirectories of the repo: <setting>/rl/<variant>
subdir = "id2-10_0.5easy_0.3medium_0.2hard/rl/op11-14_uniform"

# `subfolder` selects one checkpoint inside the repository.
tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subdir)
model = AutoModelForCausalLM.from_pretrained(repo_id, subfolder=subdir)
```
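
Because the repository holds several checkpoints, it can be convenient to keep a local copy of just one of them. The following is a sketch under stated assumptions rather than a recommended workflow: it uses `snapshot_download` from `huggingface_hub` with `allow_patterns` to fetch a single subdirectory. `subfolder_patterns` and `download_checkpoint` are hypothetical helper names.

```python
import os
from typing import List


def subfolder_patterns(subdir: str) -> List[str]:
    """Glob patterns that select every file under one checkpoint directory."""
    return [f"{subdir.strip('/')}/*"]


def download_checkpoint(repo_id: str, subdir: str) -> str:
    """Download one checkpoint directory and return its local path."""
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    snapshot_dir = snapshot_download(
        repo_id=repo_id,
        allow_patterns=subfolder_patterns(subdir),
    )
    # The snapshot keeps the repo layout, so the checkpoint sits under `subdir`.
    return os.path.join(snapshot_dir, subdir.strip("/"))
```

The returned path can then be passed to `AutoTokenizer.from_pretrained` and `AutoModelForCausalLM.from_pretrained` in place of the repo id, without a `subfolder` argument.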
## Citation

```bibtex
@misc{zhang2025interplaypretrainingmidtrainingrl,
  title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
  author={Charlie Zhang and Graham Neubig and Xiang Yue},
  year={2025},
  eprint={2512.07783},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2512.07783},
}
```