
Design And Implementation Of Webshell Detection Method Based On Machine Learning And Multi-model Fusion

Posted on: 2021-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: D Q Zhu
Full Text: PDF
GTID: 2518306557968049
Subject: Computer technology
Abstract/Summary:
A Webshell can be understood as a collection of command-execution statements or executable programs packaged in a web page file format. In industry it is mainly regarded as a means of backdoor exploitation. It is extremely harmful, and it has long been a key topic of academic research. Current mainstream Webshell detection methods are mainly based on established static rules and models. However, because the rules commonly used to match Webshells change frequently, convergence is extremely low, and it is difficult to predict how Webshells as a whole will change. To address the highly sophisticated variety of Webshells, this thesis builds on existing, mature rule-based Webshell detection and introduces a detection and analysis mechanism based on multiple scenarios and multiple algorithms. The mechanism aims to predict Webshell changes and to solve the problem that traditional algorithm combinations can satisfy only one of accuracy rate and recall rate at a time, while keeping algorithm convergence efficiency at a stable level. The main work of this thesis is as follows:

(1) Requirements analysis and overall design of the Webshell detection prototype system. Scenario analysis is carried out according to the mainstream Webshell detection scenarios. The basic Webshell threat scenarios, such as mainstream string features, text features, and trends in combined feature changes, are sorted out, and the Webshell detection logic is fixed accordingly. A dynamic correlation analysis module and a training module are designed to perform association and discrete analysis between invocations of the north-south atomized detection rules and the east-west atomized detection rules, achieving dynamic, full-cycle Webshell monitoring.

(2) Detailed system design based on full life cycle coverage of Webshells. Using key feature correlation analysis, sample files are extracted based on static and dynamic features, and the scenarios of the full life cycle detection chain in the overall design are further clarified. These scenarios are internalized, as training samples, into modules for string extraction, classification and grading, dynamic feature extraction, data persistence, scanning interaction, training, and threat tracing. This further enriches correlation analysis at the sample level, and a minimized model is used for training to avoid the risk of algorithmic inefficiency.

(3) Optimization and testing of drill-down feature extraction capabilities based on convolutional neural networks. A drill-down training method is introduced: the feature codes extracted from Webshells are converted into input indices that machine learning can recognize, and the output is the training result of the index classification. The N-Gram bag-of-words model is then combined to drill down the training dimension, and semantic association weighting forms a low-coupling set of keyword groups. This set is further trained with XGBoost, so that Webshell detection accuracy can be improved through drill-down training.

The test results show that, when the feature extraction optimization algorithm proposed in this thesis is combined with the system prototype designed in this thesis, the recall rate reaches 99.6% and the accuracy rate is 94.1%. The recall rate and accuracy rate are lower in emergency response scenarios such as major network security assurance events.
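The N-Gram bag-of-words step in item (3) can be illustrated with a minimal Python sketch. The abstract does not specify the tokenizer, the n-gram sizes, or the feature layout, so the names `tokenize`, `ngrams`, `build_vocabulary`, and `vectorize`, the identifier-only tokenization, and the choice of unigrams plus bigrams are all assumptions made for illustration and are not the thesis's implementation. The resulting count vectors are the kind of input a classifier such as XGBoost could be trained on; that training step is not shown here.

```python
import re
from collections import Counter


def tokenize(source):
    # Assumption: keep only identifier-like tokens, lowercased.
    return re.findall(r"[A-Za-z_]\w*", source.lower())


def ngrams(tokens, n):
    # Every run of n consecutive tokens, joined into one feature string.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(documents, n_values=(1, 2), min_count=1):
    # Map each n-gram seen at least min_count times to a column index.
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for n in n_values:
            counts.update(ngrams(tokens, n))
    kept = sorted(g for g, c in counts.items() if c >= min_count)
    return {gram: index for index, gram in enumerate(kept)}


def vectorize(document, vocabulary, n_values=(1, 2)):
    # Bag-of-words count vector; n-grams outside the vocabulary are ignored.
    vector = [0] * len(vocabulary)
    tokens = tokenize(document)
    for n in n_values:
        for gram in ngrams(tokens, n):
            index = vocabulary.get(gram)
            if index is not None:
                vector[index] += 1
    return vector


# Hypothetical toy samples: one Webshell-like script and one benign script.
samples = [
    "<?php eval($_POST['cmd']); ?>",
    "<?php echo 'hello world'; ?>",
]
vocabulary = build_vocabulary(samples)
features = [vectorize(s, vocabulary) for s in samples]
```

Bigrams such as `eval _post` keep some of the local context that single tokens lose, which is the usual motivation for moving from a plain bag of words to an N-Gram one.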
Keywords/Search Tags: Machine Learning, Webshell, Text Feature Extraction, Convolutional Neural Network, N-Gram Bag of Words Model