Research On Malware Detection Using Windows API

Posted on: 2022-10-19   Degree: Master   Type: Thesis
Country: China   Candidate: K Lei
GTID: 2518306341454584   Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, computer hardware and software have advanced steadily, and malicious software has proliferated alongside them. Intrusions against Windows hosts are increasingly common, and traditional network security measures struggle to meet the detection requirements of the Windows environment. To address this, researchers have proposed machine-learning-based intrusion detection for Windows. Random forest, K-Means, SVM, and similar algorithms are widely used in intrusion detection, but they suffer from high complexity, weak generalization, and long detection times when the amount of data is large. Among ensemble learning methods, LightGBM follows the Boosting approach and random forest follows the Bagging approach. Bagging is simple: a weak learner is trained on each sub-data set, and the weak learners are combined into a strong learner by voting. For simple data sets, random forest is simple and efficient. LightGBM is a decision tree algorithm built on a histogram algorithm, which discretizes large numbers of continuous feature values into bins. After studying intrusion detection and ensemble algorithms, this thesis selects LightGBM as the algorithm for the intrusion detection system and evaluates the detection results by accuracy, precision, recall, and F1 score. The main contents of the thesis are as follows:

(1) Data set processing. This research group drew on multiple public data sets of Windows API calls. Because these data sets differ, we merged them into two combined data sets, compared the two, and selected the better-performing one for our intrusion detection system. We then preprocessed the data, mined it thoroughly, and identified 296 important API calls. To account for the differences between kinds of malware, the data set is divided into nine parts, representing nine types of malicious software.

(2) This thesis designs a LightGBM-based intrusion detection system for the Windows environment. The model is trained with Python's Sklearn library, and the LightGBM learning rate, maximum tree depth, feature selection ratio, and per-iteration data ratio are tuned to determine an ideal parameter range, achieving high classification accuracy on the test set.

(3) As a horizontal comparison, the regression tree, decision tree, random forest, GBDT, and XGBoost algorithms were applied to the same data set for anomaly detection, and their training time and accuracy were compared with those of LightGBM to evaluate and optimize the model. The accuracy, precision, F1 score, and AUC of LightGBM are all higher than those of the other machine learning models, and all are above 97%.
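The evaluation measures named above (accuracy, precision, recall, and F1 score) can be illustrated with a minimal pure-Python sketch. It is not the thesis's code. The function names `per_class_metrics` and `evaluate`, the macro-averaging over classes, and the three class labels in the usage example are assumptions made for illustration only; the abstract does not say how scores were averaged across the nine malware types.

```python
def per_class_metrics(y_true, y_pred, label):
    """One-vs-rest precision, recall and F1 for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1.

    Macro averaging (an unweighted mean over classes) is an assumption
    of this sketch, not something stated in the abstract.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    scores = [per_class_metrics(y_true, y_pred, label) for label in labels]
    n = len(labels)
    return {
        "accuracy": accuracy,
        "macro_precision": sum(s[0] for s in scores) / n,
        "macro_recall": sum(s[1] for s in scores) / n,
        "macro_f1": sum(s[2] for s in scores) / n,
    }


# Hypothetical labels for illustration; not the thesis's nine malware types.
y_true = ["trojan", "trojan", "worm", "worm", "adware", "adware"]
y_pred = ["trojan", "worm", "worm", "worm", "adware", "trojan"]
result = evaluate(y_true, y_pred)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

In practice a library routine, such as those in the Sklearn library the thesis mentions, would normally be used instead; the hand-written version only makes the definitions explicit.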
Keywords/Search Tags: machine learning, intrusion detection, Windows API, LightGBM