Phishing Websites Detection Using Selected Features Classification And Bidirectional Long Short-Term Memory Neural Networks

Posted on:2019-02-14

Degree:Master

Type:Thesis

Country:China

Candidate:Y Du

Full Text:PDF

GTID:2348330569488948

Subject:Software engineering

Abstract/Summary:


The development of Internet technology has brought convenience to daily life, but it has also introduced information security problems. Phishing is a typical attack that deceives users into revealing sensitive information. Because it is highly profitable, it occurs frequently; it weakens trust among Internet users and slows the growth of online commerce. Detecting phishing websites accurately and efficiently has therefore become a focus of network information security research.

Phishing website detection based on machine learning is a hotspot of phishing research. The key points of this approach are the construction of features and the selection of classification algorithms. This thesis first studies the features associated with phishing websites. We carry out a statistical analysis of twenty thousand URL samples (half positive and half negative), covering not only common URL and HTML features but also WHOIS and Alexa information. Feature selection algorithms are used to construct an efficient feature combination, after which several machine learning algorithms classify the websites and their results are compared. Experiments show that the random forest algorithm performs better at distinguishing phishing websites.

Phishing websites are short-lived and take varied forms, and manual feature extraction always depends on human prior knowledge. In the feature-based approach described above, some commonly used URL features cannot effectively distinguish new phishing websites, and detection based on multi-feature fusion is inefficient. We therefore propose a phishing website detection mechanism that uses a neural network to learn from the URL sequence. A bidirectional LSTM can learn sequential features and long-term dependencies and so capture the implicit dependencies within the URL sequence, which makes it applicable to phishing website detection. In this thesis, the URL text is transformed into word vectors, and the training data with positive and negative labels are fed into the neural network model. The classification model is trained with the backpropagation algorithm. To verify the effectiveness of the classification model, cross-validation experiments are carried out. The experimental results show that the proposed method achieves higher accuracy and recall and effectively reduces the false positive rate and false negative rate.
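The two input representations described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the feature names in `lexical_features`, the delimiter-based `tokenize` rule, and the padding length are all assumptions chosen for illustration, since the abstract does not list the exact feature set or tokenization. The first function shows the kind of handcrafted URL features a random forest could consume; the remaining functions show how URL text might be turned into fixed-length token-index sequences before an embedding layer and a bidirectional LSTM.

```python
import re
from urllib.parse import urlparse

def lexical_features(url):
    """Handcrafted URL features (hypothetical names, not the thesis's list)."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        "uses_https": parsed.scheme == "https",
    }

def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (assumed rule)."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", url.lower()) if t]

def build_vocab(urls):
    """Map each token seen in training URLs to an index.
    Index 0 is reserved for padding and 1 for unknown tokens."""
    tokens = sorted({t for u in urls for t in tokenize(u)})
    return {tok: i + 2 for i, tok in enumerate(tokens)}

def encode(url, vocab, max_len=16):
    """Convert a URL to a fixed-length list of token indices."""
    ids = [vocab.get(t, 1) for t in tokenize(url)[:max_len]]
    return ids + [0] * (max_len - len(ids))

if __name__ == "__main__":
    train = ["http://a.com", "http://b.com"]
    vocab = build_vocab(train)
    print(lexical_features("http://192.168.0.1/login@secure"))
    print(encode("http://a.org", vocab, max_len=5))
```

The index sequences produced by `encode` would typically be passed to an embedding layer that learns one vector per token, which is one common way to obtain the "word vectors" the abstract refers to; pretrained vectors would be another option.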

Keywords/Search Tags:

Phishing Websites Detection, Multiple Features, Random Forest, Word Vector, Long Short-Term Memory neural network


Related items

1	Research On Data Processing Method Of Radar Targets Based On Long And Short-term Memory
2	Sentiment Analysis Of Short Text Based On Improved Bidirectional LSTM Neural Network
3	Fraudulent URL Detection Based On Big Data
4	Research On Phishing Website Detection Technology In Dual-structural Network
5	Research On Fall Detection Based On Long Short-term Memory Artificial Neural Network And Wrist Sensor
6	Research On Network Intrusion Detection Method Based On Bi-LSTM
7	Short Text Sentiment Classification Based On Deep Learning
8	Research On Tibetan Word Segmentation Algorithm Based On Deep Neural Network
9	Research On Deceptive Reviews Detection For Multiple Domains
10	Research On Phishing Detection Mechanism By Integrating New URL Features