@misc{xie2025logicrlunleashingllmreasoning,
title={Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning},
author={Tian Xie and Zitian Gao and Qingnan Ren and Haoming Luo and Yuqian Hong and Bryan Dai and Joey Zhou and Kai Qiu and Zhirong Wu and Chong Luo},
year={2025},
eprint={2502.14768},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.14768},
}
Logic-RL
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
News
[2025/03/20] We release ADORA: A Scalable Paradigm for Steering Learning Trajectories.
[2025/03/19] For stable length control, refer to https://github.com/lblankl/Short-RL
Benchmark
Installation
Data Preparation
You can directly use /data.
For your own data generation, here’s a demo:
Base Model
Instruct Model
Training Execution
⚙️ Implementation Details
verl/utils/reward_score/kk.py
examples/data_preprocess/kk.py
Citation
Acknowledgements