ISSN 0021-3454 (print version)
ISSN 2500-0381 (online version)

Issue 12, vol. 68 / December 2025

DOI 10.17586/0021-3454-2025-68-12-1034-1045

UDC 004.896

STRUCTURED REINFORCEMENT LEARNING FOR TIME-OPTIMAL QUADROTOR FLIGHT

M. Barhoum
ITMO University, Saint Petersburg, 197101, Russian Federation; PhD Student


A. A. Pyrkin
ITMO University, Saint Petersburg, 197101, Russian Federation; Full Professor, Dean

Reference for citation: Barhoum M., Pyrkin A. A. Structured reinforcement learning for time-optimal quadrotor flight. Journal of Instrument Engineering. 2025. Vol. 68, N 12. P. 1034–1045. DOI: 10.17586/0021-3454-2025-68-12-1034-1045.

Abstract. Synthesizing reactive, time-optimal control for quadrotors is difficult because of their complex, underactuated dynamics and the cost of solving boundary-value problems in real time. This work addresses these challenges with a reinforcement learning framework that learns optimal waypoint-reaching policies for autonomous navigation in collision-free environments. Our contributions include a cascaded actor architecture, inspired by the position-velocity separation of classical control, which improves flight stability and smooths the control actions, as well as a composite reward function incorporating radial velocity and acceleration components, which promotes maximal progress toward targets and steers the agent toward bang-bang-like maneuvers. Quantitative comparisons show that our agent produces smooth control actions, leading to optimal trajectories that adhere closely to the desired path with minimal deviation.
Keywords: quadrotors, reinforcement learning, autonomous navigation, optimal trajectory, neural networks
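
The abstract does not give the reward formula, so the following is only a minimal sketch of what a reward built from radial velocity and radial acceleration could look like. The linear combination, the weights `w_v` and `w_a`, and the function names are assumptions made for illustration and are not taken from the paper. The radial components are the projections of velocity and acceleration onto the unit vector pointing from the vehicle to the target.

```python
import math

def radial_components(pos, vel, acc, target):
    """Project velocity and acceleration onto the unit vector toward the target.

    Returns (radial velocity, radial acceleration, distance to target).
    """
    diff = [t - p for t, p in zip(target, pos)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist < 1e-9:
        # At the target the direction is undefined; report zero progress terms.
        return 0.0, 0.0, dist
    unit = [d / dist for d in diff]
    v_r = sum(v * u for v, u in zip(vel, unit))
    a_r = sum(a * u for a, u in zip(acc, unit))
    return v_r, a_r, dist

def composite_reward(pos, vel, acc, target, w_v=1.0, w_a=0.1):
    """Hypothetical reward: weighted sum of radial velocity and acceleration.

    w_v and w_a are illustrative weights, not values from the paper.
    """
    v_r, a_r, _ = radial_components(pos, vel, acc, target)
    return w_v * v_r + w_a * a_r

# Example: moving and accelerating toward a target on the x-axis.
reward = composite_reward(
    pos=(0.0, 0.0, 0.0),
    vel=(2.0, 1.0, 0.0),
    acc=(1.0, 0.0, 0.0),
    target=(10.0, 0.0, 0.0),
)
```

Under these assumptions, motion toward the target yields a positive reward and motion away from it a negative one, while velocity perpendicular to the target direction contributes nothing.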

Acknowledgement: Supported by the Ministry of Science and Higher Education of the Russian Federation (project no. FSER-2025-0002) and the ITMO University Research Projects in AI Initiative (RPAII), no. 640112.
