
Research On The Construction Method Of Network Measurement Data Set Based On Feature Extraction

Posted on: 2018-08-14
Degree: Master
Type: Thesis
Country: China
Candidate: L D Yan
Full Text: PDF
GTID: 2348330533956505
Subject: Computer application technology
Abstract/Summary:
The emergence and spread of the Internet have brought great convenience to modern society, but users who enjoy that convenience are also exposed to online threats and fraud. In recent years there have been repeated incidents in which rogue certificates were issued. If a rogue certificate is obtained by criminals and deployed on a phishing or fraud site, the risk that users' personal information will be stolen rises sharply, and the result may be loss of personal property and damage to the reputation of the businesses involved. At present, rogue certificates are identified mainly by manual inspection, although automatic identification is what the problem requires. Because rogue certificates are hard to identify and no effective rogue certificate data set exists, this thesis takes the rogue certificate as its research object and makes the following three contributions:

(1) Build the original rogue certificate data set in collaboration. To meet the goal of constructing a rogue certificate data set, the thesis combines real digital certificate data obtained through network measurement with simulated rogue certificate data generated by the Frankencert tool. Based on the fields of digital certificates and the characteristics of rogue certificates, 37 feature fields are selected and anomalous certificates are removed, yielding a 37-dimensional original rogue certificate data set of 730,000 samples.

(2) Improve the feature extraction algorithm and construct a new indicator model. Building on the traditional Isomap algorithm, an improved algorithm, MM-Isomap, is proposed. It additionally takes the class of each sample point into account, which amounts to minimizing the intra-class distance while maximizing the inter-class distance. The optimal parameters of the algorithm and its effectiveness are evaluated by accuracy, recall, and F value. Applying the algorithm to the original rogue certificate data set yields an 18-dimensional indicator model.

(3) Validate the indicator model and build an open data set. The results are tested in two parts. First, machine learning algorithms such as SVM, the J4.8 decision tree, and a BP neural network are used to evaluate the validity of the original rogue certificate data set. Second, the effectiveness of the new indicator model obtained by feature extraction is evaluated. The second part is also combined with another student's work, which applies feature selection to the original rogue certificate data set. The rogue certificate data set that combines "feature selection (22 dimensions) + feature extraction (18 dimensions)" is used to construct open data sets, which provide basic data support for further research on rogue certificates.
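The abstract evaluates the algorithm by accuracy, recall, and F value. The following is a minimal sketch of how these measures can be computed for a two-class problem such as rogue versus legitimate certificates. It assumes that label 1 marks a rogue certificate and that the F value is the balanced F1 score. The function name `binary_metrics` and the toy labels are illustrative and do not come from the thesis.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Return accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    # Guard every ratio against a zero denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (hypothetical): 1 = rogue certificate, 0 = legitimate certificate.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Recall matters most when a missed rogue certificate is costly, while the F value balances it against precision so that a classifier cannot score well simply by flagging every certificate.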
Keywords/Search Tags: Rogue Certificate, Dataset Construction, Feature Extraction, Isomap