Design And Implementation Of Phishing Email Detection System Based On Deep Learning

Posted on: 2021-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: G Y Su
GTID: 2428330632962651
Subject: Computer technology
Abstract/Summary:
With the development of the Internet, email has become an essential part of enterprise operations. While email brings great convenience to people's lives, it also introduces a cyber security risk: attackers send carefully crafted phishing emails to obtain users' personal information. In recent years, as enterprises, scholars, and government agencies have continued to study phishing emails, attackers have also improved their methods, using techniques such as short links, cloud attachments, and fake links to evade increasingly capable mail detection engines. How to detect phishing emails quickly and efficiently, given their varied content and forms, has therefore become a key research topic in cyber security.

Most current phishing email detection engines identify malicious behavior by analyzing the links or attachments embedded in an email. Such engines are ineffective against encrypted malicious attachments, anti-sandbox malicious attachments, and short-link URLs. Machine learning has developed continuously in recent years, and detecting phishing emails with machine learning algorithms has become a mainstream approach. However, the volume of email data and the number of phishing scenarios keep growing, and the performance ceiling of traditional machine learning models means they cannot detect every kind of phishing email. As research on phishing emails deepens, analyzing the email body with deep neural networks has become a feasible research direction.

This thesis is devoted to detecting phishing emails with deep learning. The work consists mainly of the following three parts.

Firstly, an automatic tagging algorithm for phishing emails based on an improved Levenshtein distance is studied. Screening phishing emails has so far had to be done manually, which is time-consuming and laborious. To address this, we propose an automatic tagging algorithm for email data. The method tags phishing emails automatically by using an improved Levenshtein distance to calculate the similarity between the phishing features of an email sample and those of manually confirmed phishing emails.

Secondly, a phishing email detection method based on an LSTM neural network is studied. Because most phishing emails contain text written to induce the recipient to act, the LSTM network is used to analyze the semantic information of the email body. The LSTM model requires input data of equal length, which has led many researchers to truncate the data or pad it with invalid bytes to meet the training requirements. This inevitably introduces redundant bytes or causes information loss. To address this, the thesis uses a masking matrix to improve the LSTM algorithm, so that email bodies of different lengths can be fed into the LSTM model for training or testing and phishing emails can be detected efficiently and accurately.

Thirdly, a phishing email detection system is developed. To detect phishing emails accurately, the thesis designs a detection system that combines a sandbox method with a deep learning method, and describes the overall process and the individual modules of the system in detail.

Finally, several comparison experiments verify the accuracy of the automatic labeling algorithm, the deep learning-based detection algorithm, and the designed detection system. The experimental results show that the two algorithms proposed in this thesis achieve good results. In the designed system, the deep learning algorithm serves as an auxiliary detection module and detects a number of phishing emails that the sandbox method cannot detect. At the same time, the system's indicators improve to some degree compared with a traditional phishing email detection system.
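
The automatic tagging step in the abstract compares an email sample with manually confirmed phishing emails using an edit-distance (Levenshtein) similarity. The abstract does not say how the distance is "improved" or which phishing features are compared, so the following is only a minimal sketch under stated assumptions: it uses the standard Levenshtein distance on raw strings, and the names `similarity`, `auto_tag`, and the threshold of 0.8 are hypothetical choices for illustration, not the thesis's algorithm.

```python
from typing import Iterable


def levenshtein(a: str, b: str) -> int:
    """Standard edit distance: insert, delete, and substitute each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalize the distance into [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def auto_tag(sample: str, confirmed_phishing: Iterable[str],
             threshold: float = 0.8) -> bool:
    """Tag `sample` as phishing if it is close to any confirmed example.

    The threshold is an arbitrary illustrative value (assumption).
    """
    return any(similarity(sample, known) >= threshold
               for known in confirmed_phishing)
```

In this sketch a sample that differs from a confirmed phishing string by only a few characters (for example, a rotated short-link suffix) is tagged, while unrelated text is not. A real system would likely compare extracted features rather than whole bodies.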
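
The abstract does not spell out how the masking matrix enters the LSTM. One common formulation, given here as an assumption rather than as the thesis's method, pads every email body to a common length $T$ and attaches a binary mask $m_t$ that is 1 for real positions and 0 for padding. With $\ell$ denoting the true length of a body, and $\tilde{h}_t$, $\tilde{c}_t$ the states an ordinary LSTM step would produce:

```latex
\begin{aligned}
m_t &= \begin{cases} 1, & t \le \ell \\ 0, & t > \ell \end{cases} \\
(\tilde{h}_t, \tilde{c}_t) &= \mathrm{LSTM}(x_t, h_{t-1}, c_{t-1}) \\
h_t &= m_t \, \tilde{h}_t + (1 - m_t) \, h_{t-1} \\
c_t &= m_t \, \tilde{c}_t + (1 - m_t) \, c_{t-1}
\end{aligned}
```

Under this formulation the states are copied forward unchanged at padded positions, so $h_T = h_\ell$ and the padding does not affect the representation passed to the classifier. For a batch of $B$ emails the per-sample masks stack into a matrix $M \in \{0,1\}^{B \times T}$.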
Keywords/Search Tags:Phishing email, Deep learning, Sentiment analysis