Clustering Analysis And Application Of Big Data In IPv6 Network Security Log Based On Hadoop

Posted on: 2020-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: J Weng
Full Text: PDF
GTID: 2428330578455879
Subject: Computer technology
Abstract/Summary:
As IPv6, the core protocol of the next-generation Internet, matures and spreads, websites that support IPv6 access are gradually becoming mainstream. Events occurring on a website produce corresponding network security logs, which record users' access behavior. Analyzing IPv6 network security logs effectively and mining the valuable information hidden in them not only reveals users' access habits but also uncovers hidden Web attacks, which helps maintain the security of the Web server system. In the era of big data, the IPv6 network security log data generated by heavy user traffic has reached the terabyte or petabyte scale, or even more. Faced with this volume of data, centralized log analysis on a single host can no longer meet the demands of data storage and computation. To solve this problem, this thesis designs and implements an IPv6 network security log analysis system based on the Hadoop distributed platform. The system aims to store and manage large-scale Web logs efficiently, analyze them rapidly, distinguish normal visits from Web attacks as accurately as possible, and improve the security of websites.

The innovations of this thesis mainly include:

(1) Because the centralized K-means algorithm on a single machine cannot process massive data effectively, an improved K-means algorithm based on MapReduce is proposed. First, the max-min distance method and the silhouette coefficient are used to optimize the selection of the initial cluster centers; then the K-means algorithm is parallelized on MapReduce.

(2) Feature vectors that differ strongly between normal access and attack behavior are extracted from the IPv6 network security logs, and a model of normal user access behavior is built with the improved MapReduce-based K-means algorithm. This improves the ability to distinguish normal access from Web attacks during IPv6 network security log analysis.

(3) An IPv6 network security log analysis system based on Hadoop, together with its workflow, is designed and implemented. The system performs distributed collection, storage, pre-processing and analysis of large volumes of IPv6 network security log data, and greatly improves the efficiency of big data analysis on these logs.

The experimental results show that, compared with the traditional K-means algorithm, the improved MapReduce-based K-means algorithm proposed in this thesis has higher accuracy and stability, and achieves better speedup and scalability when running on a Hadoop cluster. The Hadoop-based IPv6 network security log analysis system designed in this thesis achieves a higher detection rate and a lower false detection rate in identifying Web attacks.
Keywords/Search Tags: IPv6, Log Analysis, MapReduce, K-means, User Behavior Model
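The abstract does not give the algorithm's details, so the following is only a minimal single-process sketch of the idea behind innovation (1): max-min distance seeding of the initial centers, followed by K-means iterations organized as map, combine and reduce steps. All function names (`max_min_init`, `map_phase`, `combine`, `kmeans_mapreduce`) are hypothetical, the choice of the first point as the first seed is an assumption, and the silhouette-coefficient step is omitted. It is not the thesis's implementation and does not use the Hadoop API.

```python
import math


def max_min_init(points, k):
    """Max-min distance seeding (sketch).

    Assumption: the first point is taken as the first center. Each further
    center is the point whose distance to its nearest chosen center is largest.
    """
    centers = [points[0]]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers


def map_phase(split, centers):
    """Mapper: emit (index of nearest center, (point, 1)) for each point."""
    for p in split:
        idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        yield idx, (p, 1)


def combine(pairs):
    """Combiner/reducer: sum vectors and counts per center index."""
    acc = {}
    for idx, (vec, n) in pairs:
        if idx in acc:
            total, count = acc[idx]
            acc[idx] = ([x + y for x, y in zip(total, vec)], count + n)
        else:
            acc[idx] = (list(vec), n)
    return acc


def kmeans_mapreduce(splits, k, max_iter=20, tol=1e-9):
    """Run K-means over a list of input splits, simulating MapReduce rounds."""
    # Assumption: seeding runs over all points in memory; on a real cluster
    # it would more likely run on a sample.
    points = [p for split in splits for p in split]
    centers = max_min_init(points, k)
    for _ in range(max_iter):
        # Map and local combine per split, then a global reduce.
        partials = []
        for split in splits:
            partials.extend(combine(map_phase(split, centers)).items())
        totals = combine(partials)
        # New center = mean of assigned points; keep the old one if empty.
        new_centers = [
            tuple(x / totals[i][1] for x in totals[i][0]) if i in totals
            else centers[i]
            for i in range(k)
        ]
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers
```

Because each mapper emits partial sums and counts rather than raw assignments, the combiner shrinks the data sent to the reducer to at most k records per split, which is the usual reason K-means fits the MapReduce model well.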