
Research On Browser Fingerprint Anomaly Detection Model Based On Machine Learning

Posted on: 2020-07-18
Degree: Master
Type: Thesis
Country: China
Candidate: T Cao
Full Text: PDF
GTID: 2428330596481782
Subject: Management Science and Engineering
Abstract/Summary:
By the beginning of 2018, the number of websites in China had exceeded 5 million, and the level of national informatization continues to rise. However, while website construction in China is developing rapidly, website security faces a serious threat. Among Internet finance websites, the National Internet Emergency Center selected more than 1,000 for security testing and evaluation and found more than 400 high-risk vulnerabilities, including XSS (cross-site scripting), file upload, and SQL injection vulnerabilities. As the country places increasing emphasis on information planning, the security defense of information networks is therefore particularly important.

Traditional intrusion detection and security defense mechanisms, such as firewalls and anti-virus software, usually intercept attacks according to rules or blacklists. Current attack techniques, however, are more complex and changeable. Network security should turn passive defense into active detection and identify abnormal traffic before an attack occurs. Machine learning has become a research hotspot and is widely used for text, image, and speech tasks. By training on large amounts of data, machine learning methods can perform recognition, prediction, and decision-making in specific scenarios. Web intrusion detection combined with machine learning is therefore better suited to an intricate network environment, and it is expected to overcome the limitations of traditional methods, discover new network security threats, and promote the development of Web intrusion detection.

Web security issues are becoming increasingly complex. Attacks on Web sites include inserting malicious code through Web pages, constructing malicious Web requests, and controlling the execution of malware. The main objects of Web attack detection are therefore submitted malicious code, Web request information, and the malicious domain names used to control hosts. This paper focuses on two of these objects: the Web request structure and DGA (domain generation algorithm) domain names. A DGA domain name is a domain name that a controlling host generates with a randomizing algorithm in order to evade detection.

Combining the basic ideas of anomaly detection with machine learning methods, this paper extracts domain name information and structure information from browser fingerprint information and uses them to construct request fingerprint information. It uses information entropy and N-grams to extract domain name features as the basis for classifying DGA domain names, and it chooses a random forest classification algorithm for DGA domain name filtering. The Locality-Sensitive Hashing algorithm and the Levenshtein distance metric serve as the basis for hierarchical clustering of structure information, and regular expressions are used as the classification and prediction method for detecting abnormal structures. On this basis, the paper proposes a browser fingerprint anomaly detection model based on machine learning. The model focuses on the browser's request structure during the Web request process and on malicious domain names in the request communication. It aims to reduce the cost of analyzing malicious code in Web security detection, improve Web security detection performance, and make Web intrusion anomaly detection more efficient.

This paper constructs a framework for the browser fingerprint anomaly detection model. The framework consists of three parts: a data acquisition module, a data preprocessing module, and a data analysis module. In the data analysis module, the DGA domain name filter sub-model and the structural anomaly sub-model are connected in series to form a comprehensive anomaly detection model. The two sub-models work together, so that a comprehensive model replaces a single sub-model, and machine learning and text analysis methods are applied to the field of Web security detection.

The final experimental results show that the proposed DGA domain name filter sub-model has a high detection rate and a low false positive rate, and that the structural anomaly sub-model is valid to a certain extent in identifying structural anomalies in malicious requests. The comprehensive model performs well in terms of both detection performance and running time.
Keywords/Search Tags:anomaly detection, DGA domain, browser request structure, random forest, hierarchical clustering
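
The abstract names information entropy and N-grams as the sources of domain name features, and the Levenshtein distance as one basis for clustering request structures. The Python sketch below illustrates these three building blocks only; it is not the thesis's implementation. The function names, the choice of character bigrams (n = 2), and the use of bits per character as the entropy unit are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Shannon entropy, in bits per character, of the characters in a label.

    Randomly generated labels tend to score higher than dictionary-like ones,
    which is why entropy is a common DGA feature (illustrative assumption).
    """
    if not label:
        return 0.0
    total = len(label)
    counts = Counter(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def char_ngrams(label: str, n: int = 2) -> list[str]:
    """All overlapping character n-grams of a label (bigrams by default)."""
    return [label[i:i + n] for i in range(len(label) - n + 1)]


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings, using a two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]
```

In a pipeline of the kind the abstract describes, the entropy and N-gram statistics would be assembled into a feature vector for a random forest classifier, and the edit distance would serve as a pairwise dissimilarity between request structures for hierarchical clustering. How the thesis combines these pieces is not stated in the abstract.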