
Research On Intelligent Device Identification Technology Based On Machine Learning

Posted on: 2023-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: L Yao
Full Text: PDF
GTID: 2568306791481524
Subject: Computer application technology
Abstract/Summary:
With the rapid development of network technology, large numbers of new smart devices such as web cameras, printers, and smart home devices are now connected to the Internet alongside traditional hosts such as personal computers and servers. The wide application of smart devices brings convenience but also causes many security problems. First, the software on smart devices is relatively fixed, and computing and storage resources are limited, so it is usually impractical to install security systems such as anti-virus software and endpoint firewalls, which leaves security protection insufficient. Second, smart devices are numerous, varied in type, and widely distributed, so it is difficult for operation and maintenance personnel to keep track of the number, distribution, and patch status of devices in the network. Third, users lack security awareness and the necessary security skills. These factors have led to a series of cyberspace security problems, such as smart devices being easily taken over remotely by attackers and failing at critical moments.

It is therefore important to strengthen the identification of cyberspace device assets: to comprehensively grasp smart device attributes such as type, application services, and operating system; to discover "risk assets" such as devices left unmanaged for long periods or connected to the network without authorization; to quickly locate and remediate devices with vulnerabilities; and to understand the network security situation, so as to improve network security protection.

Automatically and accurately identifying smart devices in cyberspace is challenging. Traditional device detection and identification sends protocol requests to devices and then matches fields of the response data against manually predefined keywords. This manual, rule-based approach involves a heavy workload, is error-prone, and is slow to update the
rule base. It is difficult to comprehensively detect and manage cyberspace assets this way. To address this, this paper proposes machine learning-based smart device identification techniques. The main work and contributions are as follows:

(1) Theories and technologies related to smart device identification are reviewed. Along three dimensions (smart devices, key technologies, and typical applications), the architecture of smart devices is studied, their attributes are catalogued, the basic principles and key technologies of smart device identification are explained, and typical applications are introduced. Finally, a technical architecture for smart device identification is designed. Based on the characteristics of two different network environments, the local area network (LAN) and the Internet, this paper explains the rationale for studying two different techniques: passive identification of smart devices based on network traffic fingerprints and machine learning, and active identification of smart devices based on Web fingerprints and neural networks.

(2) A passive identification technique for smart devices based on network traffic fingerprints and machine learning is proposed. For identifying smart devices in the LAN environment, the SmartDevId fingerprint extraction and identification method based on network traffic is designed. The method captures the network traffic exchanged between devices through passive listening and uses traffic analysis to parse the packets, extracting the device's behavioral characteristics and payload information as the smart device fingerprint. Traditional machine learning algorithms such as decision trees are then used to model the fingerprints and identify the device type. To address data imbalance, a resampling method is proposed to expand the small-sample classes. To further improve the accuracy of device
identification, a packet aggregation algorithm based on MAC address is designed. To test the feasibility of the SmartDevId fingerprint extraction method, this paper runs experiments at the single-packet level on three datasets, including the Aalto University dataset. The results show that the classification accuracy of the proposed SmartDevId fingerprint method is 6% higher than that of the IoT Sentinel fingerprint method. To evaluate the packet aggregation algorithm, single-packet and aggregated results are compared: at the single-packet level, the classification accuracy of the SmartDevId fingerprint method on the three datasets is 74.6%, 85.8%, and 88.7%, respectively, while with the MAC address-based packet aggregation algorithm it is 91.7%, 99.6%, and 99.2%, respectively. Aggregating packets by MAC address therefore achieves higher device identification accuracy than identifying smart devices from a single packet.

(3) An active identification technique for smart devices based on Web fingerprints and neural networks is proposed. To acquire response data from smart devices at scale in the Internet environment, a web crawler based on asynchronous stateless scanning is designed. It discovers online IP addresses by performing a stateless SYN scan of the target network and then sends a "GET" request to each online IP address to obtain the response data. For generating and identifying smart device Web fingerprints, a neural network-based scheme for automatic Web fingerprint generation and identification is designed. It extracts text by parsing the response data with an HTML parser, extracts the Web fingerprint from the text using natural language processing, and models the Web fingerprints with a neural network to identify smart device attributes. To verify the feasibility of the active identification technology of
smart devices, four neural network models, including a convolutional neural network, were studied, and experiments were carried out on real data collected by probing. The results show that, among the four models, the recurrent convolutional neural network model has the best classification performance, with a recognition accuracy of 90.59% and a weighted F1-score of 89.43%; training the model takes about 25 minutes. Compared with traditional machine learning-based device identification methods, the proposed active identification technique based on Web fingerprints and neural networks offers automatic extraction and identification of Web fingerprints and fine-grained identification. It can identify a variety of new smart device types and can also effectively identify manufacturer and product information.
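
The abstract does not spell out how the MAC address-based packet aggregation in part (2) combines per-packet results. One plausible reading, sketched below purely as an illustration, is to group the classifier's single-packet predictions by source MAC address and take a majority vote per device. The function name aggregate_by_mac, the (MAC, label) input format, and the voting rule are all assumptions made for this sketch, not details taken from the thesis.

```python
from collections import Counter, defaultdict


def aggregate_by_mac(packet_predictions):
    """Majority-vote per-packet labels for each source MAC address.

    ASSUMPTION: the voting rule and the input format are hypothetical;
    the thesis abstract only states that packets are aggregated by MAC.

    packet_predictions: iterable of (mac_address, predicted_label) pairs.
    Returns a dict mapping each lower-cased MAC address to its most
    frequent predicted label.
    """
    votes = defaultdict(Counter)
    for mac, label in packet_predictions:
        # Normalise case so "AA:BB:..." and "aa:bb:..." count as one device.
        votes[mac.lower()][label] += 1
    # most_common(1) yields [(label, count)]; equal counts keep first-seen order.
    return {mac: counter.most_common(1)[0][0] for mac, counter in votes.items()}


example = [
    ("aa:bb:cc:00:00:01", "camera"),
    ("aa:bb:cc:00:00:01", "printer"),
    ("aa:bb:cc:00:00:01", "camera"),
    ("AA:BB:CC:00:00:02", "printer"),
]
print(aggregate_by_mac(example))
# {'aa:bb:cc:00:00:01': 'camera', 'aa:bb:cc:00:00:02': 'printer'}
```

Under this reading, a single misclassified packet (the "printer" vote for the first device) is outvoted by the device's other packets, which is consistent with the accuracy gain the abstract reports for aggregation over single-packet identification.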
Keywords/Search Tags: Cyberspace asset identification, Smart devices, Fingerprint identification, Classifier