Cross-Site Script Detection Based on Deep Learning

Posted on: 2021-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Cheng
GTID: 2428330611450432
Subject: Computer Science and Technology
Abstract/Summary:
Leaks of online user information are now frequent. As a part of network attack detection, cross-site scripting (XSS) detection is a focus of research in network security. Most traditional machine-learning-based XSS detection techniques, however, suffer from poor data readability, insufficient feature extraction, and low efficiency, all caused by obfuscation of the malicious code. This thesis therefore proposes an XSS detection model based on deep learning, which strengthens the model's feature extraction for cross-site scripts, improves detection accuracy, and reduces the false positive rate. The main work is as follows:

1. By analyzing XSS script code, this thesis identifies two characteristics: local correlation and long-distance dependence. Further study of these two characteristics leads to the conclusion that they are not correlated with each other, and models are built to verify this conclusion. For the experiments, a crawler was used to collect more than 100,000 samples, including both cross-site script samples and normal samples. To improve the readability of the data, several decoding techniques were applied to undo the obfuscation of the samples, and word2vec was used to convert the text samples into word vectors.

2. Cross-site scripts exhibit long-distance contextual dependence, for example a long run of meaningless statements between a start tag and its end tag, and the existing long short-term memory (LSTM) model has shortcomings in extracting such dependency features. A bidirectional long short-term memory (BiLSTM) model is therefore proposed to fully extract the context-dependent features of cross-site scripts, with a softmax classifier used for detection. Experimental results show that, compared with the LSTM model, the BiLSTM model extracts more features and improves detection accuracy by 1.5%.

3. Cross-site scripts show both high local correlation and long-distance dependence, but the features extracted by the BiLSTM model ignore local features and include information that is not highly relevant to cross-site scripts. An Encoder-Decoder framework improved with an attention mechanism is therefore used to build the detection model. First, because local correlation and long-distance dependence are not highly related to each other, the Encoder is composed of a convolutional neural network (CNN) and a bidirectional gated recurrent unit (BiGRU) network in parallel, so that both the local features and the context information of cross-site scripts are taken into account and as many effective features as possible are extracted. Second, the attention mechanism computes attention weights over the input data, which solves the "distraction problem" of the traditional Encoder-Decoder framework. Finally, the resulting model achieves a detection accuracy of 99.27% and a false positive rate of 0.05%.
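The decoding step mentioned in the abstract can be illustrated with a minimal sketch. The thesis does not say which decoding techniques or which tokenizer were used, so everything here is an assumption made for illustration: the sketch handles only URL percent-encoding and HTML entities, and the names `decode_sample`, `tokenize`, and `TOKEN_RE` are hypothetical. The resulting token lists are the kind of input a word2vec model could then be trained on.

```python
import html
import re
from urllib.parse import unquote


def decode_sample(raw, max_rounds=5):
    """Undo nested URL and HTML-entity encoding (assumed obfuscation types).

    Decoding is repeated until the text stops changing, because attackers
    often encode a payload more than once.
    """
    text = raw
    for _ in range(max_rounds):
        decoded = html.unescape(unquote(text))
        if decoded == text:
            break
        text = decoded
    # Lowercasing normalizes mixed-case tricks such as <ScRiPt>.
    return text.lower()


# Hypothetical tokenizer: opening/closing tags, attribute names, function
# calls, single punctuation characters, then any remaining word.
TOKEN_RE = re.compile(r"</?\w+|\w+=|\w+\(|[<>()\"']|\w+")


def tokenize(text):
    """Split a decoded sample into tokens for a word-embedding model."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    sample = "%253Cscript%253Ealert(1)%253C%2Fscript%253E"
    decoded = decode_sample(sample)
    print(decoded)
    print(tokenize(decoded))
```

In this sketch, a double URL-encoded payload is reduced to its readable form before tokenization, so that the same attack written with different encodings maps to the same token sequence.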
Keywords/Search Tags: cross-site script detection, bidirectional long short-term memory network, encoder-decoder framework, convolutional neural network, bidirectional gated recurrent unit network, attention mechanism