Research On Webshell Detection Based On Machine Learning

Posted on: 2020-04-04   Degree: Master   Type: Thesis
Country: China   Candidate: Z K Xian   Full Text: PDF
GTID: 2438330620955611   Subject: Information security
Abstract/Summary:
In recent years, China's Internet has developed rapidly. While the network has made daily life more convenient, network security has also drawn growing attention. Website security concerns not only a company's internal network but also the safety of people's personal information. A Webshell is a key component of an attacker's compromise of a website, so detecting Webshells on a website is essential to its security. Research on Webshell detection has nevertheless progressed slowly: Webshells are growing rapidly in number, and attackers often upgrade them to evade new detection methods, producing new variants faster than detection technology advances. Traditional static detection cannot deal effectively with encryption and obfuscation, and traditional dynamic detection seriously degrades operating efficiency when facing large amounts of data. To address these problems, this thesis studies the use of machine learning to detect Webshells. The main work and innovations are as follows:

(1) Based on an analysis of the collected Webshell samples, five sample feature extraction methods are summarized, which effectively address the inability of static analysis to detect encrypted and obfuscated samples. To address the convergence and genetic drift problems that may arise in the source selection process of the artificial bee colony algorithm, the algorithm is improved in two respects: selection probability calculation and poor-solution replacement. A support vector machine is used as the classifier for Webshell detection, and the improved artificial bee colony algorithm is used to optimize its training parameters C and ?.

(2) Existing binary image visualization techniques for malware are analyzed, and a Webshell image visualization method based on Opcode sequences is proposed. Code in an interpreted language such as PHP is translated into Opcodes in turn, so Opcodes can effectively represent a Webshell and are not affected by encryption or obfuscation. The Opcode call sequence is represented as a matrix and then converted into a two-dimensional grayscale image. The two grayscale images generated from the Opcode frequency features are then combined to synthesize an RGB image.

(3) Based on the image visualization algorithm above, KNN is applied to the RGB images to identify minority-class samples, and a convolutional neural network model is then used to classify the two-dimensional grayscale images of the Opcode sequences. A weighted Softmax loss is proposed that amplifies the weights of minority classes in the loss calculation, thereby balancing the influence of different classes during training.

The experimental results show that the improved artificial bee colony algorithm outperforms the unimproved algorithm when optimizing the support vector machine for Webshell detection, and that the RGB images have a significant effect on Webshell classification. The proposed Opcode sequence images, combined with deep learning, achieve good Webshell detection results, and the Softmax loss weighting algorithm effectively improves the final detection performance.
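The abstract does not specify how the Opcode call sequence is laid out as a matrix, so the following is only a minimal sketch of one plausible encoding, not the thesis's actual method. It assumes a hypothetical vocabulary `vocab` that maps opcode names to integer ids, scales those ids onto gray levels 1 to 255, reserves gray level 0 for padding and unknown opcodes, and wraps the sequence into rows of a chosen `width`. The function name, the vocabulary, and the scaling rule are all assumptions made for illustration.

```python
from typing import Dict, List


def opcodes_to_gray_image(
    opcodes: List[str], vocab: Dict[str, int], width: int
) -> List[List[int]]:
    """Map an opcode sequence to a row-major matrix of 0-255 gray levels.

    Assumption: gray level 0 marks padding or an opcode missing from `vocab`;
    known opcode ids are spread linearly over gray levels 1..255.
    """
    if width <= 0:
        raise ValueError("width must be positive")

    vocab_size = max(vocab.values()) + 1 if vocab else 1
    scale_den = max(vocab_size - 1, 1)  # avoid division by zero for tiny vocabularies

    pixels = []
    for name in opcodes:
        if name in vocab:
            pixels.append(1 + (vocab[name] * 254) // scale_den)
        else:
            pixels.append(0)

    # Pad the last row with zeros so every row has exactly `width` pixels.
    remainder = len(pixels) % width
    if remainder:
        pixels.extend([0] * (width - remainder))

    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


if __name__ == "__main__":
    # Hypothetical three-entry vocabulary; a real one would cover all opcodes seen in training.
    demo_vocab = {"ZEND_ECHO": 0, "ZEND_INCLUDE_OR_EVAL": 1, "ZEND_RETURN": 2}
    demo_sequence = ["ZEND_INCLUDE_OR_EVAL", "ZEND_ECHO", "ZEND_RETURN"]
    print(opcodes_to_gray_image(demo_sequence, demo_vocab, width=2))
    # prints [[128, 1], [255, 0]]
```

A fixed `width` keeps every image the same number of columns; samples of different lengths would still need resizing or truncation to a common height before being fed to a convolutional network.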
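The weighted Softmax loss can be illustrated with the standard class-weighted cross-entropy, in which each sample's loss is multiplied by a weight for its true class. This is a sketch under that assumption; the abstract does not give the exact weighting formula used in the thesis, and the weight values below are hypothetical.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax: subtract the max logit before exponentiating."""
    m = max(logits)
    log_total = math.log(sum(math.exp(z - m) for z in logits))
    return [z - m - log_total for z in logits]


def weighted_softmax_loss(
    batch_logits: Sequence[Sequence[float]],
    labels: Sequence[int],
    class_weights: Sequence[float],
) -> float:
    """Mean over the batch of -w[y] * log(softmax(z)[y]).

    A larger weight for a minority class makes its misclassifications
    contribute more to the loss, and therefore to the gradient.
    """
    if not labels or len(batch_logits) != len(labels):
        raise ValueError("batch_logits and labels must be non-empty and equal in length")

    total = 0.0
    for logits, label in zip(batch_logits, labels):
        total += -class_weights[label] * log_softmax(logits)[label]
    return total / len(labels)


if __name__ == "__main__":
    # Two classes; class 1 is treated as the minority and weighted 3x (hypothetical value).
    logits = [[2.0, 0.5], [0.2, 1.5]]
    labels = [0, 1]
    print(weighted_softmax_loss(logits, labels, class_weights=[1.0, 3.0]))
```

Setting every class weight to 1.0 recovers the ordinary Softmax cross-entropy loss, which makes the effect of the weighting easy to compare in experiments.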
Keywords/Search Tags: Webshell, Support Vector Machine, Convolutional Neural Network, K-nearest Neighbor, Artificial Bee Colony Algorithm