# Research on Maneuver Decision of Unmanned Combat Aerial Vehicles Based on Segmented Reward Function and Improved Deep Q-Network

### Juntao Ruan, Yi Qin, Fei Wang, Jianjun Huang, Fujie Wang, Fang Guo, Yaohua Hu

#### 2024

#### Abstract

Intelligent air combat is the main trend of future aerial warfare, and the maneuver decision-making ability of unmanned combat aerial vehicles (UCAVs) directly affects the outcome of an engagement. To study maneuver decision-making in one-on-one (1v1) UCAV air combat, this paper proposes a maneuver decision generation algorithm that combines a segmented reward function with an improved deep Q-network (DQN). First, a mathematical model of the air combat problem in a complex environment is established, together with equations that describe the spatial coordinates, attitude, and velocity of the UCAV's current state. Given the overload coefficients as inputs, these equations yield the basic maneuver action commands. Then, a segmented reward function is designed to guide the UCAV toward a position of advantage during air combat. To address the parameter uncertainty of the DQN in the maneuver decision-making process, an improved DQN algorithm is then proposed. An extended Kalman filter (EKF) is introduced: the uncertain parameters of the policy (strategy) network are used to construct the system state equations, the parameters of the target network are used to construct the system observation equations, and optimal estimates of the DQN parameters are obtained through iterative EKF updates. Simulation experiments demonstrate the effectiveness of the segmented reward function and the improved DQN algorithm for autonomous UCAV maneuver decision-making.
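
The EKF idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation, whose equations the abstract does not give. It assumes a random-walk state model for the strategy (policy) network parameters, an identity observation model that treats the target network parameters as a noisy measurement of them, and a diagonal covariance. Under these assumptions the EKF linearization is trivial and the update reduces to a standard per-parameter Kalman filter. The class name `DiagonalKalmanEstimator` and the noise values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiagonalKalmanEstimator:
    """Per-parameter Kalman filter with an identity observation model (assumed)."""
    estimate: List[float]            # current parameter estimate (filter state)
    variance: List[float]            # per-parameter error variance (diagonal of P)
    process_noise: float = 1e-4      # Q, hypothetical value
    observation_noise: float = 1e-2  # R, hypothetical value

    def update(self, observed: List[float]) -> List[float]:
        """Run one predict/correct step using `observed` as the measurement."""
        for i, z in enumerate(observed):
            # Predict: under a random-walk model the estimate is unchanged
            # and only its uncertainty grows.
            p_pred = self.variance[i] + self.process_noise
            # Correct: with H = 1 the Kalman gain is a scalar in (0, 1).
            gain = p_pred / (p_pred + self.observation_noise)
            self.estimate[i] += gain * (z - self.estimate[i])
            self.variance[i] = (1.0 - gain) * p_pred
        return self.estimate


# Stand-ins for flattened network weights (illustrative values only).
policy_params = [0.5, -1.0]
target_params = [0.3, -0.8]

ekf = DiagonalKalmanEstimator(estimate=list(policy_params), variance=[1.0, 1.0])
fused = ekf.update(target_params)
print(fused)
```

Each call moves the estimate from the policy parameters toward the target parameters by an amount set by the gain, and the variance shrinks as observations accumulate. A nonlinear observation model would require the Jacobian-based EKF update instead of this scalar form.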

