| With the development of technologies such as cloud computing, remote work, and the paperless office, the global data volume is increasing rapidly. From 2020 to 2025, the global data volume will rise from 44 ZB to as much as 175 ZB. Businesses have incurred huge losses from malware. According to statistics, an average of 3.8 pieces of malware are generated per second, whereas a skilled security analyst can analyze only 12.8 samples per day on average, leaving a serious imbalance between attack and defense. It is therefore necessary to find effective techniques for detecting malware in a timely and accurate manner. The many types, large volume, and rapid mutation of malware greatly increase the difficulty of detection. In addition, existing deep-learning-based malware detection methods lack robustness and risk being easily bypassed by malware developers. This paper therefore studies how to detect malware efficiently and accurately, how to improve the robustness of deep-learning-based malware detection models, and how to design an efficient and secure malware detection system, and it proposes a series of new methods. The main work and innovations of this paper are as follows: (1) To address the heavy reliance of traditional malware classification methods on manual effort and the low accuracy of deep-learning-based malware classification methods, an end-to-end malware classification method, MCARC, based on the machine-code byte stream of malware is proposed. Based on the operating characteristics of malware, the method uses a CNN to extract neighborhood information and a multi-head attention mechanism to extract full-text information from the malware. Experimental results show that the method achieves a macro-accuracy of 91.6% and a macro-F1 of 90% on the malware dataset BODMAS. (2) To address the problem that deep learning methods are susceptible to interference and may give wrong results, a robustness
reinforcement method for malware classification models is proposed, and a robust malware classification model, R-MCARC, is obtained by training with this method. Drawing on common malware perturbation techniques, the method proposes two modification perturbation strategies that do not affect malware functionality and two random perturbation strategies that destroy malware executables, and it designs and verifies a GAN-based online reinforcement process and a stochastic-perturbation offline reinforcement process. Experimental results show that the proposed perturbation methods reduce the macro-F1 of the MCARC model without robustness enhancement from 90% to 82%, while the robustness-enhanced model R-MCARC achieves a macro-accuracy of 91.8% and a macro-F1 of 90.3%, which demonstrates the effectiveness of the robustness enhancement method. (3) Based on the methods proposed in this paper and the needs of users and security personnel, a malware detection system is designed and implemented. The system collects information from user systems through an agent. When suspicious behavior occurs, malware classification is performed using both the traditional method and the deep learning model, and the agent program carries out the security emergency response. Security personnel can perform malware analysis to stay aware of malware threats. In addition to accepting malware samples uploaded by security personnel, the system regularly obtains new malware samples through crawlers and automatically optimizes the model to ensure timely malware detection. |
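
The macro-F1 values reported above are presumably the unweighted mean of per-class F1 scores, which is the usual definition; the abstract does not spell out its exact evaluation procedure. The following is a minimal Python sketch of that standard metric. The function name `macro_f1` and the family labels in the usage note are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list.

    Illustrative helper (assumed name), not the paper's evaluation code.
    """
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)

    # Every class counts equally, regardless of how many samples it has.
    return sum(scores) / len(scores)
```

For example, with hypothetical labels `y_true = ["trojan", "trojan", "worm", "worm"]` and `y_pred = ["trojan", "worm", "worm", "worm"]`, the per-class F1 scores are 2/3 and 4/5. Because each class is weighted equally, a rare malware family that is classified poorly lowers the score as much as a common one would.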