
Network Characteristics Analysis For IP Geolocation

Posted on: 2021-06-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F X Yuan
GTID: 1488306230972069
Subject: Cyberspace security
Abstract/Summary:
Network entity IP geolocation (IP geolocation for short) refers to estimating the geographic locations of network entities configured with IP addresses (such as routers and end hosts) through various technologies. Research on IP geolocation and related technologies can provide technical support for location-based service promotion, geolocation of sensitive network targets, forensics for network fraud and malicious attacks, and network situational awareness, and therefore has important theoretical significance and application value. As research has deepened, IP geolocation based on network measurement has become the mainstream among existing methods. For this type of method, regional network topology construction and regional topology analysis based on special structures are very important. However, traditional research on network topology rarely focuses on IP geolocation, and the topology obtained is not regional; analysis of the various special structures with city-level attributes contained in regional topologies is also lacking. To this end, based on an analysis of network characteristics such as delay and path in the actual Internet environment, this dissertation focuses on how to apply these characteristics to regional network topology construction, regional topology analysis based on special structures, and target IP geolocation. The main work is as follows.

1. The theoretical value and practical significance of research on IP geolocation technology are elaborated; the current mainstream IP geolocation methods based on network measurement are introduced; the research progress of traditional network topology analysis is reviewed in detail; and several problems that urgently need to be resolved in regional network topology construction and regional topology analysis based on special structures are pointed out.

2. To address the problem of identifying boundary route IPs when determining the boundary of a regional network topology, an urban network topology boundary route IP identification algorithm for IP geolocation is proposed. Firstly, to obtain information such as delay and path, vantage points (VPs for short) are deployed inside and outside the target city to probe landmarks inside the city, and the hostnames of route IPs are obtained from public data sources. Secondly, the single-hop delays between route IPs are calculated, with the maximum value observed within the city used as the threshold, and the similarities between route IP hostnames are also calculated. Thirdly, for paths obtained from VPs outside the city, each single-hop delay is compared with the threshold in sequence from back to front, and the IP located where the single-hop delay changes from below the threshold to above it is identified as the boundary IP. Finally, for paths without significant delay changes, the hostname strings of each pair of adjacent route IPs are compared for similarity from back to front, and the boundary IP in the path is identified according to the change in similarity. Experiments are carried out in many cities in China and the US. The results show that, compared with the traditional boundary IP identification method based on delay distribution and the method based on statistical analysis of paths, the proposed algorithm can process a higher proportion of paths, with average increases of about 106.9% and 73.4%, respectively; the boundary IPs it identifies are also more accurate, and the accuracy of the IP-level nodes obtained increases by about 51.8% and 17.3% on average, respectively.

3. Alias resolution is very important for accurately acquiring nodes and links and for constructing regional network topology, and thus for supporting special structure analysis and IP geolocation. A large-scale network alias resolution algorithm for IP geolocation is proposed. Firstly, a number of known alias IP and non-alias IP samples for a specific target area are obtained from a public data source, and VPs are deployed to probe both the samples and the IPs to be resolved in the area, yielding delays and paths. Secondly, a four-dimensional feature vector, including features such as delay similarity and path similarity, is constructed for each pair of samples, and the feature vectors of the samples are input into a classifier to train the classification model. Thirdly, filtering rules are designed to exclude, from the IPs to be resolved, those that cannot be aliases. Finally, for the IPs that remain after filtering, feature vectors are constructed and input into the trained model to obtain the classification result. Experiments are conducted on millions of samples from CAIDA located in some areas of China and the US. The results show that, compared with existing typical methods such as RadarGun, MIDAR, and TreeNET, the accuracy of the proposed algorithm is improved by about 15.8%, 4.8%, and 5.7%, respectively, and the time consumption is reduced by up to 77.8%, 65.3%, and 55.2%, respectively.

4. Homogeneous blocks belonging to a city can be used for target IP geolocation. A homogeneous block identification algorithm for IP geolocation is proposed. Firstly, for a specific target area, IPs for which multiple databases report the same location, together with landmarks in the area, are obtained as targets. Secondly, each /31 containing one target IP is used as a candidate block, and VPs are deployed to probe the IPs in the block to obtain delays and paths. The proposed alias resolution algorithm is used to merge some of the route IPs in the paths. Thirdly, the designed discrimination conditions are used to decide whether the candidate block is homogeneous. Next, the proposed boundary IP identification algorithm is used to analyze the city-level location of each IP within the homogeneous block and to determine whether the block belongs to a certain city. Finally, the homogeneous block is expanded step by step, and each new block is checked in the same way, until the largest homogeneous block that contains one target IP and belongs to a city is identified. Experiments are carried out in many cities in China and the US. The results show that the proposed algorithm identifies homogeneous blocks with high accuracy, and the location accuracy of IPs within a block is up to 99.4%. When the identified homogeneous blocks are used for target IP geolocation, the geolocation accuracy for targets reachable by probing is about 95.7% on average. When the identified homogeneous blocks are used for landmark expansion, the number of landmarks increases significantly, which in turn raises the success rates of existing IP geolocation methods.

5. Complete PoPs in a specific area are very important for target IP geolocation. A high-completeness PoP partition algorithm for IP geolocation is proposed. Firstly, for a specific target area, IPs for which multiple databases report the same location, together with landmarks in the area obtained by the proposed homogeneous block identification algorithm, are used as target IPs. Secondly, VPs are deployed to probe the target IPs to obtain delays and paths, and the proposed boundary IP identification algorithm is used to obtain the route nodes in the area. Thirdly, subnet analysis is performed on the route nodes, the subnet IPs are acquired and probed, and new route nodes in the area are obtained from the resulting paths. This process of subnet analysis, subnet IP probing, and new route node acquisition is iterated until a large number of new nodes can no longer be acquired. Next, the proposed alias resolution algorithm is used to merge some nodes. Finally, a large number of Bi-fans with common nodes are extracted and the PoPs are partitioned. Experiments are carried out in Henan province and Florida. The results show that, compared with existing typical methods such as PoP-Geo and PoP-NTA, the PoPs obtained by the proposed algorithm contain more nodes and have higher completeness, with the number of nodes increased by approximately 5.2 and 4.2 times, respectively. When these high-completeness PoPs are used for IP geolocation, the minimum failure rate is only about 0.6%, and the PoPs are also expected to enable geolocation of some target IPs that are unreachable but whose paths pass through a PoP.

Finally, the whole work is summarized and problems for future study are pointed out.
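
As a rough illustration of the delay-based step of the boundary IP identification described in item 2, the following Python sketch scans a path from back to front and returns the IP at which the single-hop delay changes from below the threshold to above it. It is a minimal sketch under stated assumptions, not the dissertation's implementation: the function names, the (IP, RTT) path representation, the use of RTT differences between adjacent hops as single-hop delays, and the sample addresses and delays are all hypothetical. The hostname-similarity fallback for paths without significant delay changes is not covered.

```python
from typing import List, Optional, Tuple

# Assumed representation: (route IP, RTT in ms measured from the vantage point).
Hop = Tuple[str, float]


def single_hop_delays(path: List[Hop]) -> List[float]:
    """Delay of each link, approximated as the RTT difference of adjacent hops.

    The result at index k belongs to the link path[k] -> path[k + 1].
    """
    return [path[i][1] - path[i - 1][1] for i in range(1, len(path))]


def city_threshold(inside_paths: List[List[Hop]]) -> float:
    """Largest single-hop delay seen on paths probed from VPs inside the city.

    Raises ValueError if the paths contain no links at all.
    """
    return max(d for p in inside_paths for d in single_hop_delays(p))


def find_boundary_ip(path: List[Hop], threshold: float) -> Optional[str]:
    """Scan links from the destination end toward the source.

    Returns the IP sitting between a run of below-threshold links (assumed to
    be inside the city) and the first above-threshold link, or None when no
    such transition exists.
    """
    delays = single_hop_delays(path)
    seen_small = False
    for k in range(len(delays) - 1, -1, -1):
        if delays[k] <= threshold:
            seen_small = True
        else:
            # path[k + 1] is the hop shared by the large link and the small run.
            return path[k + 1][0] if seen_small else None
    return None


# Hypothetical measurements (documentation address ranges, made-up delays).
inside_paths = [[("192.0.2.1", 1.0), ("192.0.2.2", 2.5), ("192.0.2.3", 3.0)]]
outside_path = [
    ("198.51.100.1", 2.0),
    ("198.51.100.2", 10.0),
    ("203.0.113.1", 30.0),
    ("203.0.113.2", 31.0),
    ("203.0.113.3", 31.5),
]

threshold = city_threshold(inside_paths)  # 1.5 ms for the sample data
print(find_boundary_ip(outside_path, threshold))  # 203.0.113.1
```

In real traceroute data, RTT differences between adjacent hops can be negative or noisy, so a practical version would likely need repeated probes and some smoothing before this comparison is meaningful.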
Keywords/Search Tags: IP geolocation, Network measurement, Network characteristic analysis, Boundary IP identification, Alias resolution, Homogeneous block identification, PoP partition