| With the rapid development of phased array technology, radars have evolved from early single-function, stand-alone operation to multi-function, networked cooperation. Compared with a single radar, a netted radar can fuse information from the radars in the network by appropriately siting multiple radars of different types, frequency bands and polarization modes, thus obtaining richer and more reliable information than a single radar and posing a serious threat to the jamming side. Traditional jamming tends to target the frequency, amplitude, phase and other information domains of intercepted radar signals; it cannot jam multiple radars simultaneously or respond in real time to changes in radar mission status, which makes it difficult to meet the information-intensive, dynamically changing demands of modern warfare. For this reason, jammers are also developing towards multi-beam and networked operation, which offers spatial complementarity, frequency-band complementarity and temporal synergy, and is of great theoretical and practical significance for jamming multifunctional netted radars more effectively. However, multi-beam jammers still face bottlenecks in jamming strategy selection and jamming power allocation against netted radars, and this paper studies these key issues in depth. It addresses two challenges: insufficient jamming power resources in self-defense jamming scenarios, and the difficulty of dynamically deciding jamming styles and parameters in remote area support scenarios. To this end, the paper proposes a jamming resource scheduling method for countering netted radars under limited jamming power and a jamming decision-making method based on Dyna-Q two-layer reinforcement learning. Together they achieve adaptive power allocation for multi-beam jammers and dynamic decision making of jamming patterns and parameters, which can effectively improve jamming effectiveness against netted radars.
The main research content of this paper comprises the following three parts: 1. Analysis of typical working modes of netted radars and research on jamming theory. First, the working modes of phased array radar are analyzed, and the working principles and characteristics of the radar in several typical modes (searching and tracking, target tracking and target recognition) are summarized. Second, the generation mechanisms of several typical radar jamming styles (noise frequency-modulation jamming, noise convolution jamming, comb-spectrum jamming and dense false-target jamming) are explored, and their characteristics in the time, frequency and time-frequency domains are analyzed. This lays a theoretical foundation for the subsequent radar threat assessment, jamming resource scheduling and jamming decision making. 2. Multi-beam jamming resource scheduling against netted radars. In the self-defense jamming scenario, the protected target is detected and identified by a netted radar performing multiple tasks simultaneously. This paper studies a resource scheduling method for multi-beam jamming against netted radars that exploits the advantages of multi-beam jamming systems and improves the utilization of jamming resources. First, a radar confrontation model is built from the actual confrontation scenario, and the confrontation process is decomposed into four radar confrontation scenarios according to how the working mode of the netted radar changes over time; next, the radar operating mode is identified using a temporal convolutional network (TCN), and the radar threat
level is mapped from the identified operating mode; then, based on the information observable by the jammer, the reduction in detection probability and the radar threat level are incorporated into the effectiveness evaluation of jamming resources, and a jamming resource allocation objective function is constructed so that jamming power can be allocated dynamically, with the particle swarm optimization (PSO) algorithm used to solve it; finally, in the experimental verification, the dynamic scheduling objective function is solved for different jamming scenarios involving multiple netted radars, and the optimal allocation of jamming resources against the netted radar is obtained, which verifies the effectiveness of the proposed method. 3. Hierarchical reinforcement learning radar jamming decision making based on Dyna-Q. To address the challenge that many jamming patterns are available in the remote support scenario and that jamming patterns and jamming parameters are difficult to decide dynamically, a hierarchical reinforcement learning radar jamming decision-making method based on Dyna-Q is proposed. The method makes a two-level decision on jamming patterns and jamming parameters, and its Dyna structure combines the fast convergence of model-based reinforcement learning with model-free reinforcement learning, which speeds up convergence and improves the ability to decide jamming patterns and parameters quickly. First, the basic principles of reinforcement learning, the Markov decision process and the Dyna-Q algorithm are briefly reviewed; then the working principle of the decision-making model is laid out, and the jamming benefit function is designed according to the function of each layer. Finally, a typical remote support confrontation scenario is constructed to verify the proposed method experimentally. The results show that the Dyna-Q-based hierarchical reinforcement learning radar jamming decision-making method improves the convergence speed of jamming decision making and enables dynamic decisions on jamming patterns and parameters. |
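
The abstract does not specify the states, actions or benefit function of the decision model, so the following is only a minimal single-layer sketch of the general Dyna-Q idea it refers to: each real step performs a model-free Q-learning update, stores the observed transition in a learned model, and then replays a few simulated transitions from that model. The chain environment, its reward, the function name `dyna_q` and all hyperparameter values are hypothetical and are not taken from the paper.

```python
import random


def dyna_q(n_states=6, episodes=50, planning_steps=10,
           alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Dyna-Q on a hypothetical chain environment.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Entering the last state ends the episode with reward 1.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    model = {}  # learned model: (state, action) -> (reward, next_state)

    def step(state, action):
        # Hypothetical deterministic environment dynamics.
        nxt = min(state + 1, goal) if action == 1 else max(state - 1, 0)
        return (1.0 if nxt == goal else 0.0), nxt

    def greedy(state):
        # Greedy action with random tie-breaking.
        best = max(q[state])
        return rng.choice([a for a in (0, 1) if q[state][a] == best])

    def update(s, a, r, s2):
        # One-step Q-learning update; the goal state has no future value.
        target = r if s2 == goal else r + gamma * max(q[s2])
        q[s][a] += alpha * (target - q[s][a])

    for _ in range(episodes):
        state = 0
        while state != goal:
            explore = rng.random() < epsilon
            action = rng.randrange(2) if explore else greedy(state)
            reward, nxt = step(state, action)
            update(state, action, reward, nxt)       # direct, model-free update
            model[(state, action)] = (reward, nxt)   # model learning
            for _ in range(planning_steps):          # planning on simulated experience
                (ps, pa), (pr, pn) = rng.choice(list(model.items()))
                update(ps, pa, pr, pn)
            state = nxt
    return q


if __name__ == "__main__":
    q_table = dyna_q()
    for s, (left, right) in enumerate(q_table):
        print(s, round(left, 3), round(right, 3))
```

Setting `planning_steps=0` reduces the sketch to plain Q-learning, which is one way to see the contribution of the planning loop to convergence speed in this toy setting.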