👋 Hi, everyone!
We are the ByteDance Seed team.

Flow-based Policy for Online Reinforcement Learning

We are delighted to introduce FlowRL, a new approach to online reinforcement learning that combines a flow-based policy representation with Wasserstein-2-regularized optimization. The result is a promising framework for bringing generative policies into reinforcement learning.

News

  • [2025/06/10] 🔥 We released the PyTorch version of the code.
  • [2025/09/18] 🎉 Our paper has been accepted to NeurIPS 2025.

Introduction

FlowRL is an actor-critic framework that represents the policy with a flow-based model and optimizes it under Wasserstein-2 (W2) regularization. By implicitly constraining the current policy to stay close, in W2 distance, to the optimal behavioral policy, FlowRL achieves superior performance on challenging benchmarks such as DM_Control (Dog and Humanoid domains) and Humanoid_Bench.
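To make the two ingredients above concrete, here is a minimal, standard-library-only sketch. It is not the FlowRL implementation: `toy_velocity` is a hypothetical, state-free stand-in for a learned velocity network, `sample_action` shows the generic idea of drawing an action by Euler-integrating a velocity field from Gaussian noise, and `w2_squared_1d` computes the squared W2 distance between two equal-size 1-D samples using the sorted-sample formula. All function and parameter names are assumptions made for illustration.

```python
import random


def toy_velocity(action, t, target):
    """Hypothetical stand-in for a learned velocity network v(s, a, t).

    For a straight-line path from noise to a single target action, the
    velocity that reaches the target at t = 1 is (target - a) / (1 - t).
    A real policy would condition on the state and be a trained network.
    """
    return [(g - a) / (1.0 - t) for a, g in zip(action, target)]


def sample_action(velocity_fn, target, action_dim, num_steps=10, rng=random):
    """Draw an action by Euler-integrating da/dt = v(a, t) from t = 0 to 1."""
    # Start from standard Gaussian noise, as is common for flow models.
    action = [rng.gauss(0.0, 1.0) for _ in range(action_dim)]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so 1 - t is never zero
        v = velocity_fn(action, t, target)
        action = [a + dt * vi for a, vi in zip(action, v)]
    return action


def w2_squared_1d(xs, ys):
    """Squared W2 distance between two equal-size 1-D empirical samples.

    In one dimension the optimal coupling matches sorted samples, so the
    distance is the mean squared gap between order statistics.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum((x - y) ** 2 for x, y in pairs) / len(xs)


if __name__ == "__main__":
    rng = random.Random(0)
    action = sample_action(toy_velocity, [0.5, -0.2], action_dim=2, rng=rng)
    print("sampled action:", action)
    print("W2^2:", w2_squared_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```

With this particular toy field the final Euler step lands on the target regardless of the starting noise, which makes the integration loop easy to check by hand. A learned field would instead produce a distribution over actions.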

Getting Started

  1. Set up a Conda Environment: Create and activate an environment with

    conda create -n flowrl python=3.11
    conda activate flowrl
  2. Clone this Repository:

    git clone https://github.com/bytedance/FlowRL.git
    cd FlowRL
  3. Install FlowRL Dependencies:

    pip install -r requirements.txt
  4. Training Examples:

    • Run a single training instance:

      python3 main.py --domain dog --task run
    • Run parallel training:

      bash scripts/train_parallel.sh
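For readers who prefer to launch several runs from Python, the sketch below builds one command per (domain, task) pair using only the `--domain` and `--task` flags shown above. It does not reproduce `scripts/train_parallel.sh`, whose contents are not described here. The job list is hypothetical: only `dog`/`run` appears in this README, and whether `main.py` accepts the other pair is an assumption. The launcher defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess

# Hypothetical job list. Only ("dog", "run") is taken from this README;
# ("humanoid", "walk") is an assumed domain/task pair for illustration.
JOBS = [("dog", "run"), ("humanoid", "walk")]


def build_command(domain, task):
    """Build the argument list for one training run."""
    return ["python3", "main.py", "--domain", domain, "--task", task]


def launch_all(jobs, dry_run=True):
    """Print the commands (dry run) or start them all and wait for them."""
    commands = [build_command(domain, task) for domain, task in jobs]
    if dry_run:
        for command in commands:
            print(shlex.join(command))
        return commands
    # Start every run as a separate process, then wait for all of them.
    processes = [subprocess.Popen(command) for command in commands]
    for process in processes:
        process.wait()
    return commands


if __name__ == "__main__":
    launch_all(JOBS)
```

Call `launch_all(JOBS, dry_run=False)` from the repository root to actually start the processes. Each run competes for the same GPU and CPU resources, so the number of concurrent jobs should match the available hardware.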

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

TODO

  • Release JAX version source code

Citation

If you find FlowRL useful for your research and applications, please consider giving us a star ⭐ or citing us:
@article{lv2025flow,
  title={Flow-Based Policy for Online Reinforcement Learning},
  author={Lv, Lei and Li, Yunfei and Luo, Yu and Sun, Fuchun and Kong, Tao and Xu, Jiafeng and Ma, Xiao},
  journal={arXiv preprint arXiv:2506.12811},
  year={2025}
}

About ByteDance Seed Team

Founded in 2023, ByteDance Seed Team is dedicated to crafting the industry’s most advanced AI foundation models. The team aspires to become a world-class research team and make significant contributions to the advancement of science and society.
