
Research on the Operating System Identification Method Based on Ensemble Learning in IPv6

Posted on: 2022-09-07
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhao
Full Text: PDF
GTID: 2518306566962239
Subject: Computer technology
Abstract/Summary:
Operating system (OS) identification tools play a critical role in the reconnaissance phase of penetration testing. Traditional OS identification relies on passive or active tools backed by fingerprint databases; identification based on machine learning is rare. Moreover, most existing tools are designed for IPv4 networks. As IPv6 networks develop, an OS identification tool or method suited to IPv6 is urgently needed. This paper adopts two methods for accurate OS identification. The first is a collaborative neural network ensemble with a unique voting scheme. It uses a multi-level structure in which the voted output of the upper-level neural networks decides which lower-level neural network is used next. The second is based on the random forest algorithm. Both methods perform passive OS identification using IPv6 features and packet metadata. The experiments in this paper show that both methods are effective:

(1) When the data sets contain only Windows and Linux packets, the neural network ensemble achieves an average accuracy of 84.8%, and a random forest composed of 30 decision trees achieves an average accuracy of 93.6%.
(2) The effect of additional training on the accuracy of the neural network ensemble was studied. Additional training raised its average accuracy to 92.9%, which is 8.1 percentage points higher than without additional training.
(3) When Mac OS packets are added to the data set, the neural network ensemble still achieves an average accuracy of 76.0% and maintains 100% recognition accuracy for the Windows operating system. The random forest algorithm achieves an average overall accuracy of 89.6%, which is higher than that of the neural network ensemble.
(4) The identification method based on the random forest algorithm has higher recognition accuracy and faster running speed, which makes it an accurate and fast passive OS identification method.
(5) Although the method based on the neural network ensemble needs a longer training time than the one based on the random forest algorithm, once training is complete it can perform large-scale continuous OS identification in a very short time.

In the comparison, the average overall accuracy of the method based on the neural network ensemble is 5.68% higher than that of the decision tree algorithm, 10.6% higher than that of the support vector machine (SVM) algorithm, and 17.19% higher than that of the naive Bayes algorithm. The average overall accuracy of the method based on the random forest algorithm is 19.28% higher than that of the decision tree algorithm, 24.2% higher than that of the SVM algorithm, and 30.79% higher than that of the naive Bayes algorithm. These results show that the identification methods based on the neural network ensemble and on the random forest algorithm both perform well for OS identification. Compared with methods based on the decision tree, SVM, and naive Bayes algorithms, they offer advantages in recognition accuracy and running time.
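The multi-level voting structure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the stand-in classifiers are hand-written toy rules rather than trained neural networks, and the feature names (`hop_limit`, `flow_label`, `payload_length`), the labels, and the thresholds are all hypothetical placeholders chosen only to make the routing logic visible.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Features = Dict[str, int]
Classifier = Callable[[Features], str]


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def cascade_predict(
    sample: Features,
    upper: List[Classifier],
    lower: Dict[str, List[Classifier]],
) -> str:
    """Vote at the upper level, then route to the lower level chosen by that vote."""
    family = majority_vote([clf(sample) for clf in upper])
    members = lower.get(family)
    if not members:
        # No lower-level ensemble registered: the upper vote is the final answer.
        return family
    return majority_vote([clf(sample) for clf in members])


# --- Toy stand-ins for trained networks (hypothetical rules, not real fingerprints) ---
def upper_a(f: Features) -> str:
    return "windows" if f["hop_limit"] > 64 else "unix-like"


def upper_b(f: Features) -> str:
    return "windows" if f["flow_label"] == 0 else "unix-like"


def upper_c(f: Features) -> str:
    return "unix-like" if f["payload_length"] >= 40 else "windows"


def lower_a(f: Features) -> str:
    return "linux" if f["payload_length"] >= 40 else "macos"


def lower_b(f: Features) -> str:
    return "linux" if f["flow_label"] % 2 else "macos"


def lower_c(f: Features) -> str:
    return "macos" if f["hop_limit"] < 64 else "linux"


UPPER = [upper_a, upper_b, upper_c]
LOWER = {"unix-like": [lower_a, lower_b, lower_c]}

sample_1 = {"hop_limit": 128, "flow_label": 0, "payload_length": 40}
sample_2 = {"hop_limit": 64, "flow_label": 12345, "payload_length": 40}

result_1 = cascade_predict(sample_1, UPPER, LOWER)
result_2 = cascade_predict(sample_2, UPPER, LOWER)
print(result_1, result_2)
```

In this sketch the upper-level vote picks a coarse family and only the matching lower-level ensemble is evaluated, which is one plausible reading of "voting the output of the upper neural network to decide the next lower neural network". The abstract does not specify how the levels are actually organized.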
Keywords/Search Tags: IPv6, OS identification, artificial neural network ensemble, random forest