
Research On Network User Identification Technology In High-speed Traffic Environment

Posted on: 2018-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: P L Pan
Full Text: PDF
GTID: 2348330542968904
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, the number of Internet users and devices has grown explosively with the rapid development of the Internet. At the same time, the Internet has become a hotbed of cybercrime, and criminal incidents occur frequently, so strengthening network surveillance is imperative. As a result, user identification based on network traffic has become one of the most active research topics. Most previous work treats network user identification as equivalent to network device identification. However, the detection rate drops significantly when one user owns multiple devices or one device is shared by multiple users. Existing traffic-based device identification technologies use physical signals to distinguish subtle differences between hardware devices, but their recognition ability is weak. Alternatively, network protocol stack parameters are used to identify operating systems, which achieves only coarse-grained recognition. Existing traffic-based user identification technologies rely on Web records; they extract only a few features and take a long time to identify users, so they barely achieve effective online user identification. In addition, traditional centralized computing cannot meet the needs of real-time analysis and high-speed network traffic processing.

To address these issues, this thesis proposes and implements a distributed online network user identification technology. The major work covers the following three aspects:

1. A high-speed network traffic analysis technology based on distributed computing is proposed. Packets in the high-speed network are captured with the PFQ kernel module and then transmitted to the distributed processing module through Kafka. The distributed processing module is responsible for protocol identification, application identification, User-Agent detection, DNS (Domain Name System) resolution, etc. It then extracts the relevant data from the network traffic and stores it in HBase.

2. An online user identification technology is developed. For device identification, a total of 961 features, including software, operating system, and User-Agent fields, are extracted from the devices' runtime environment to form device fingerprints. Several classification algorithms are trained on the data and their effectiveness is verified. The logistic regression model is then chosen to identify network devices online with a sliding window. For user identification, a total of 57,593 features, including Web records, DNS information, and User-Agent fields, are generated from the users' network behavior, and several classification algorithms are again trained and verified. The Multinomial Naive Bayes model is then chosen, together with a sliding window, to achieve online user identification. The accuracy of online user identification reaches 79.51% within 5 minutes.

3. By integrating the distributed network traffic analysis technology with the online user identification technology, a prototype online user identification system is designed and implemented. It can be used to identify network users online.

In summary, this thesis studies and implements online user identification technology. Packets in the high-speed network are captured with the PFQ kernel module and processed in a distributed system, and a prototype online user identification system is designed and implemented that can effectively identify network users online.
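The combination of a Multinomial Naive Bayes model with a sliding window can be illustrated with a minimal sketch. It is not the thesis implementation: the class names `MultinomialNB` and `SlidingWindowIdentifier`, the token format (`dns:`, `host:`, `ua:` prefixes), the toy training data, and the 300-second window (chosen here only to mirror the 5-minute figure quoted in the abstract) are all assumptions made for illustration. The classifier is written in plain Python with Laplace smoothing so that the example is self-contained.

```python
import math
from collections import Counter, defaultdict, deque


class MultinomialNB:
    """Small multinomial Naive Bayes over token counts (Laplace smoothing)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.token_counts = defaultdict(Counter)  # user -> token -> count
        self.doc_counts = Counter()               # user -> number of training samples
        self.vocab = set()

    def fit(self, samples):
        # samples: iterable of (user_label, list_of_tokens)
        for user, tokens in samples:
            self.doc_counts[user] += 1
            self.token_counts[user].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_user, best_score = None, -math.inf
        for user in sorted(self.doc_counts):
            counts = self.token_counts[user]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(self.doc_counts[user] / total_docs)  # log prior
            for tok in tokens:
                if tok in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_user, best_score = user, score
        return best_user


class SlidingWindowIdentifier:
    """Keeps only recent events and re-classifies after every new event."""

    def __init__(self, model, window_seconds=300):  # 300 s is an assumed window size
        self.model = model
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, token), oldest first

    def observe(self, timestamp, token):
        self.events.append((timestamp, token))
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            self.events.popleft()  # drop events that fell out of the window
        return self.model.predict([tok for _, tok in self.events])


# Hypothetical training data: one token list per user.
training = [
    ("alice", ["dns:mail.example.com", "host:news.example.org", "ua:Firefox"]),
    ("bob", ["dns:video.example.net", "host:games.example.net", "ua:Chrome"]),
]
model = MultinomialNB().fit(training)
identifier = SlidingWindowIdentifier(model, window_seconds=300)

first = identifier.observe(0, "dns:mail.example.com")  # only this event in the window
later = identifier.observe(400, "ua:Chrome")           # the first event has expired
print(first, later)
```

In this sketch the prediction is recomputed from whatever events remain in the window, so an identity decision can change as older traffic expires. A real deployment would draw its tokens from the extracted Web records, DNS information, and User-Agent fields rather than from hand-written strings.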
Keywords/Search Tags:Traffic Analysis, Device Identification, Device Fingerprinting, User Identification, Behavioral Fingerprinting