| With the development of IoT, the variety and number of IoT devices have grown dramatically. Because most intelligent terminals have limited hardware resources, many manufacturers do not take security into account, leaving IoT devices more vulnerable to malware attacks. In recent years, deep learning-based methods for identifying malware have avoided labor-intensive feature engineering and can achieve end-to-end malware identification with excellent performance. However, most research has focused on improving model accuracy at the expense of hardware resource consumption and real-time performance. Because the proposed deep learning models are usually highly complex, they consume significant hardware resources, such as computing power and memory, and require considerable time in both the training and inference phases, which makes it impossible to deploy them on some IoT devices with limited hardware resources. Therefore, how to deploy a high-performance deep learning model on lightweight intelligent devices with limited hardware resources, so as to identify malware accurately and efficiently, is a pressing issue. To address these issues, this thesis implements several malware identification methods and explores two novel attention-based malware identification methods, in order to select the best-performing method and compress it. To ensure that the compressed model loses little performance, this thesis proposes two compression methods: a bidirectional constrained knowledge distillation method, called BiKD, and a compression framework combining neural architecture search and knowledge distillation, called MalM2L. The main work is as follows: (1) Bi-directional constrained Knowledge Distillation (BiKD) is proposed to compress large-scale 
malware identification models. Unlike previous work on knowledge distillation, BiKD adds a constraint from the lightweight model to the training of the large-scale model, enabling the large-scale model to generate representations that are easier for the lightweight model to learn, so that the performance of the lightweight model more closely matches that of the large-scale model. The experimental results validate the effectiveness of BiKD: the performance of the compressed malware identification model can come very close to, or even exceed, that of the original model. (2) A compression framework for malware identification models based on neural architecture search and knowledge distillation, called MalM2L, is proposed. MalM2L first compresses a large-scale malware identification model into a lightweight model with optimal learning capability by combining knowledge distillation with neural architecture search. The lightweight model's performance is then further enhanced by similarity-preserving knowledge distillation. The experimental results show that the lightweight model generated by MalM2L outperforms most existing malware identification techniques in terms of accuracy, resource consumption, and real-time performance. |
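
Both proposed methods build on knowledge distillation, in which a lightweight (student) model is trained to match the softened outputs of a large-scale (teacher) model. The following is a minimal sketch of the standard response-based distillation loss from the general literature, not the BiKD or MalM2L objective, which the abstract does not specify. All function names, the temperature of 4.0, and the weighting factor `alpha` are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def cross_entropy(probs, label, eps=1e-12):
    """Cross-entropy of predicted probabilities against an integer class label."""
    return -math.log(probs[label] + eps)

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft (teacher-matching) term and a hard (label) term.

    temperature and alpha are assumed hyperparameters, not values from the thesis.
    """
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft-term gradient scale comparable across temperatures.
    soft_term = (temperature ** 2) * kl_divergence(soft_teacher, soft_student)
    hard_term = cross_entropy(softmax(student_logits), label)
    return alpha * soft_term + (1 - alpha) * hard_term

if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]   # hypothetical teacher logits for 3 classes
    student = [2.5, 1.5, 0.2]   # hypothetical student logits
    print(distillation_loss(student, teacher, label=0))
```

In this sketch only the student is penalized for diverging from the teacher. According to the abstract, BiKD additionally constrains the teacher's training using the student, which would require an extra term that is not shown here.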