awesome-RL-for-UAVs
Surveys
Azar A T, Koubaa A, Ali Mohamed N, et al. Drone deep reinforcement learning: A review[J]. Electronics, 2021, 10(9): 999. [Paper]
Wang X, Wang Y, Su X, et al. Deep reinforcement learning-based air combat maneuver decision-making: literature review, implementation tutorial and future direction[J]. Artificial Intelligence Review, 2024, 57(1): 1. [Paper]
Hanover D, Loquercio A, Bauersfeld L, et al. Autonomous drone racing: A survey[J]. IEEE Transactions on Robotics, 2024. [Paper]
Richter D J, Calix R A, Kim K. A review of reinforcement learning for fixed-wing aircraft control tasks[J]. IEEE Access, 2024. [Paper]
RL Environments
Rennie G. (Gym-JSBSim) Autonomous control of simulated fixed wing aircraft using deep reinforcement learning[J]. 2018. [Paper][Code]
Bøhn E, Coates E M, Moe S, et al. (Fixed-Wing-Gym) Deep reinforcement learning attitude control of fixed-wing UAVs using proximal policy optimization[C]//2019 international conference on unmanned aircraft systems (ICUAS). IEEE, 2019: 523-533. [Paper][Code]
Madaan R, Gyde N, Vemprala S, et al. AirSim drone racing lab[C]//NeurIPS 2019 competition and demonstration track. PMLR, 2020: 177-191. [Paper][Code]
Song Y, Naji S, Kaufmann E, et al. Flightmare: A flexible quadrotor simulator[C]//Conference on Robot Learning. PMLR, 2021: 1147-1157. [Paper][Code]
Liu Q, Jiang Y, Ma X. Light Aircraft Game: A lightweight, scalable, gym-wrapped aircraft competitive environment with baseline reinforcement learning algorithms. Github. 2022. [Code]
Chan J H, Liu K, Chen Y, et al. Reinforcement learning-based drone simulators: survey, practice, and challenge[J]. Artificial Intelligence Review, 2024, 57(10): 281. [Paper]
Gong X, Dawei F, Xu K, et al. VVC-Gym: A Fixed-Wing UAV Reinforcement Learning Environment for Multi-Goal Long-Horizon Problems[C]//The Thirteenth International Conference on Learning Representations. 2025. [Paper][Code]
Kulkarni M, Rehberg W, Alexis K. Aerial Gym Simulator: A Framework for Highly Parallelized Simulation of Aerial Robots[J]. IEEE Robotics and Automation Letters, 2025. [Paper][Code]
Xu B, Gao F, Yu C, et al. Omnidrones: An efficient and flexible platform for reinforcement learning in drone control[J]. IEEE Robotics and Automation Letters, 2024, 9(3): 2838-2844. [Paper][Code]
Robustness
Mysore S, Mabsout B, Mancuso R, et al. Regularizing action policies for smooth control with reinforcement learning[C]//2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021: 1810-1816. [Paper]
O’Connell M, Shi G, Shi X, et al. Neural-fly enables rapid learning for agile flight in strong winds[J]. Science Robotics, 2022, 7(66): eabm6597. [Paper]
Loquercio A, Kaufmann E, Ranftl R, et al. Learning high-speed flight in the wild[J]. Science Robotics, 2021, 6(59): eabg5810. [Paper]
Sim2Real
Loquercio A, Kaufmann E, Ranftl R, et al. Deep drone racing: From simulation to reality with domain randomization[J]. IEEE Transactions on Robotics, 2019, 36(1): 1-14. [Paper]
Vision-Language-Action Models
Serpiva V, Lykov A, Myshlyaev A, et al. RaceVLA: VLA-based Racing Drone Navigation with Human-like Behaviour[J]. arXiv preprint arXiv:2503.02572, 2025. [Paper]
Applications
Trajectory tracking
Loquercio A, Kaufmann E, Ranftl R, et al. Deep drone racing: From simulation to reality with domain randomization[J]. IEEE Transactions on Robotics, 2019, 36(1): 1-14. [Paper]
Chen J, Yu C, Xie Y, et al. What Matters in Learning A Zero-Shot Sim-to-Real RL Policy for Quadrotor Control? A Comprehensive Study[J]. IEEE Robotics and Automation Letters, 2024. [Paper]
Racing / Agile Flight
Moon H, Martinez-Carranza J, Cieslewski T, et al. Challenges and implemented technologies used in autonomous drone racing[J]. Intelligent Service Robotics, 2019, 12: 137-148. [Paper]
Loquercio A, Kaufmann E, Ranftl R, et al. Deep drone racing: From simulation to reality with domain randomization[J]. IEEE Transactions on Robotics, 2019, 36(1): 1-14. [Paper]
De Wagter C, Paredes-Vallés F, Sheth N, et al. The artificial intelligence behind the winning entry to the 2019 AI robotic racing competition[J]. arXiv preprint arXiv:2109.14985, 2021. [Paper]
Song Y, Steinweg M, Kaufmann E, et al. Autonomous drone racing with deep reinforcement learning[C]//2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2021: 1205-1212. [Paper]
Song Y, Scaramuzza D. Policy search for model predictive control with application to agile drone flight[J]. IEEE Transactions on Robotics, 2022, 38(4): 2114-2130. [Paper]
Kaufmann E, Bauersfeld L, Scaramuzza D. A benchmark comparison of learned control policies for agile quadrotor flight[C]//2022 International Conference on Robotics and Automation (ICRA). IEEE, 2022: 10504-10510. [Paper]
De Wagter C, Paredes-Vallés F, Sheth N, et al. The sensing, state-estimation, and control behind the winning entry to the 2019 artificial intelligence robotic racing competition[J]. Field Robotics, 2022, 2: 1263-1290. [Paper]
Song Y, Romero A, Müller M, et al. Reaching the limit in autonomous racing: Optimal control versus reinforcement learning[J]. Science Robotics, 2023, 8(82): eadg1462. [Paper]
Kaufmann E, Bauersfeld L, Loquercio A, et al. Champion-level drone racing using deep reinforcement learning[J]. Nature, 2023, 620(7976): 982-987. [Paper]
Fu J, Song Y, Wu Y, et al. Learning deep sensorimotor policies for vision-based autonomous drone racing[C]//2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2023: 5243-5250. [Paper][Video]
Zhang Y, Hu Y, Song Y, et al. Learning vision-based agile flight via differentiable physics[J]. Nature Machine Intelligence, 2025: 1-13. [Paper][Project]
Attitude Control / Velocity Vector Control
Koch W, Mancuso R, West R, et al. Reinforcement learning for UAV attitude control[J]. ACM Transactions on Cyber-Physical Systems, 2019, 3(2): 1-21. [Paper]
Bøhn E, Coates E M, Reinhardt D, et al. Data-efficient deep reinforcement learning for attitude control of fixed-wing UAVs: Field experiments[J]. IEEE Transactions on Neural Networks and Learning Systems, 2023, 35(3): 3168-3180. [Paper]
Gong X, Dawei F, Xu K, et al. V-Pilot: A Velocity Vector Control Agent for Fixed-Wing UAVs from Imperfect Demonstrations[C]//IEEE International Conference on Robotics and Automation. 2025. [Code]
Basic Flight Maneuvers (BFMs)
Clarke S G, Hwang I. Deep reinforcement learning control for aerobatic maneuvering of agile fixed-wing aircraft[C]//AIAA Scitech 2020 Forum. 2020: 0136. [Paper]
Kong W, Zhou D, Yang Z, et al. Maneuver strategy generation of UCAV for within visual range air combat based on multi-agent reinforcement learning and target position prediction[J]. Applied Sciences, 2020, 10(15): 5198. [Paper]
Pope A P, Ide J S, Mićović D, et al. Hierarchical reinforcement learning for air-to-air combat[C]//2021 international conference on unmanned aircraft systems (ICUAS). IEEE, 2021: 275-284. [Paper]
Pope A P, Ide J S, Mićović D, et al. Hierarchical reinforcement learning for air combat at DARPA’s AlphaDogfight trials[J]. IEEE Transactions on Artificial Intelligence, 2022, 4(6): 1371-1385. [Paper]
Hu W, Gao Z, Quan J, et al. Fixed-wing stalled maneuver control technology based on deep reinforcement learning[C]//2022 IEEE 5th International Conference on Big Data and Artificial Intelligence (BDAI). IEEE, 2022: 19-25. [Paper]
Cao S, Wang X, Zhang R, et al. From demonstration to flight: realization of autonomous aerobatic maneuvers for fast, miniature fixed-wing UAVs[J]. IEEE Robotics and Automation Letters, 2022, 7(2): 5771-5778. [Paper](https://ieeexplore.ieee.org/abstract/document/9720976/)
Li L, Zhang X, Qian C, et al. Basic flight maneuver generation of fixed-wing plane based on proximal policy optimization[J]. Neural Computing and Applications, 2023, 35(14): 10239-10255. [Paper]
Yin Z, Zheng C, Guo S, et al. TACO: General Acrobatic Flight Control via Target-and-Command-Oriented Reinforcement Learning[J]. arXiv preprint arXiv:2503.01125, 2025. [Paper]
Combat
Bae J H, Jung H, Kim S, et al. Deep reinforcement learning-based air-to-air combat maneuver generation in a realistic environment[J]. IEEE Access, 2023, 11: 26427-26440. [Paper]