Research And Implementation Of Website Abnormal Changes Detection System

Posted on: 2018-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wu
Full Text: PDF
GTID: 2348330512988368
Subject: Computer technology
Abstract/Summary:
With the advance of Internet technology and the arrival of the big data era, websites have become the platform on which government agencies, enterprises and public institutions, cultural and media organizations, scientific research institutions, and financial institutions disseminate information and deliver integrated applications. Website usage increases year by year, and web page content is growing more complex. Website owners are responsible for protecting the security, authority, and accuracy of site information and for providing the public with correct information and services. However, the security threats that websites face are becoming increasingly serious, and incidents of illegal intrusion and website tampering emerge in an endless stream. Real-time website monitoring and anti-tampering technology has become a hot topic in the field of information security, so designing and developing a website abnormal changes detection system is of great significance. This thesis therefore presents a website abnormal changes detection system to address these problems.

First, the thesis studies the characteristics of abnormal changes in web pages and surveys the principles and technologies of a variety of website content security software. After comparing their advantages and disadvantages, it selects a design built on the Hadoop platform that works by detecting changes in the contents of website files. The system's main functions are website data acquisition, website abnormal change detection, and abnormal data management. Concretely, this covers crawling a large number of complete website data files, storing the file data in HDFS with initial filtering, detecting the specific content that changed on the website and judging its legitimacy, and managing the abnormal data. The large volume of website file data is processed using the HDFS file system and the MapReduce distributed computing model provided by the Hadoop platform.

The system crawls a large number of website data files and stores them in HDFS, with an index store designed to speed up data search. The MD5 message-digest algorithm and an improved text comparison algorithm based on graph theory are used in the system to detect abnormal changes, combined with the MapReduce computing model to achieve rapid and accurate anomaly detection. Illegal links are judged by converting the URL address into an IP address. Illegal vocabulary is judged by combining the classical Naive Bayesian classification algorithm from data mining with Chinese word segmentation technology, after which the abnormal change information is classified and filtered out. Through system design, implementation, and testing, the system basically meets the performance and functional requirements for detecting abnormal website changes. The system also ran stably, efficiently, and without errors during operation.
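The MD5-based detection step described above can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes two crawled snapshots are available as in-memory dictionaries mapping a file path to its bytes (standing in for files stored in HDFS), it omits MapReduce, the index store, and the graph-based text comparison, and the names `md5_digest` and `detect_changes` are hypothetical.

```python
import hashlib
from typing import Dict, List


def md5_digest(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a file's contents."""
    return hashlib.md5(data).hexdigest()


def detect_changes(
    baseline: Dict[str, bytes], current: Dict[str, bytes]
) -> Dict[str, List[str]]:
    """Compare two snapshots of a site and report which paths changed.

    Both arguments map a file path to that file's bytes (an assumption
    made for this sketch; the thesis stores crawled files in HDFS).
    """
    base_digests = {path: md5_digest(body) for path, body in baseline.items()}
    curr_digests = {path: md5_digest(body) for path, body in current.items()}

    base_paths = set(base_digests)
    curr_paths = set(curr_digests)

    return {
        # Present only in the new snapshot.
        "added": sorted(curr_paths - base_paths),
        # Present only in the old snapshot.
        "removed": sorted(base_paths - curr_paths),
        # Present in both, but the digests differ.
        "modified": sorted(
            path
            for path in base_paths & curr_paths
            if base_digests[path] != curr_digests[path]
        ),
    }


if __name__ == "__main__":
    old_snapshot = {
        "/index.html": b"<h1>Welcome</h1>",
        "/about.html": b"<p>About us</p>",
    }
    new_snapshot = {
        "/index.html": b"<h1>Hacked</h1>",
        "/news.html": b"<p>News</p>",
    }
    print(detect_changes(old_snapshot, new_snapshot))
```

A digest comparison only flags that a file differs. Locating the specific changed content and judging whether the change is legitimate would be left to the later stages the abstract describes (text comparison, link checking, and vocabulary classification).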
Keywords/Search Tags: Website Detection, Hadoop, MD5, Text Comparison Algorithm, Naive Bayesian