
Design Of Security Text Classification System Based On Ensemble Algorithm

Posted on: 2019-12-08
Degree: Master
Type: Thesis
Country: China
Candidate: K W Gu
GTID: 2428330566499367
Subject: Computer technology
Abstract/Summary:
With the rapid development of computer and information technology, electronic information leakage has gradually become an important issue for enterprises. Although Data Loss Prevention (DLP) systems detect leaks by monitoring information flows, they cannot prevent such incidents from occurring. By assigning security levels to an enterprise's internal documents, a DLP system can protect the data from unauthorized access through label-based access control. At present, these labels are assigned mainly by hand. Open problems with manual security-level determination include a lack of standardization, unclear rules, and subjectivity, which lead to inaccurate security levels and cost enterprises time and money. Building on existing technologies, this dissertation studies machine-learning and deep-learning methods for assisted security-level classification of texts. The work addresses the following problems in automated security-level determination: (1) Text categorization methods proposed for specific domains are evaluated on the task of classifying an enterprise's classified documents, to verify their performance and analyze the feasibility of implementation; (2) Feature selection and feature extraction for text categorization are studied, extracting secrecy-related features from multiple angles and reducing feature noise, and the results show improved classifier accuracy; (3) A security classification method is designed that combines word-embedding text representation with a convolutional neural network, and the use of document word order and n-grams to extract security-level features is analyzed; (4) To improve the system's generalization performance, a method for fusing individual classifiers based on multiple ensemble strategies is proposed to reduce the risk of over-fitting and under-fitting; (5) Finally, experiments verify the feasibility of the classification method for classified texts and the effectiveness of the ensemble algorithms in classification prediction.
Keywords/Search Tags:Data loss prevention, Document security identification, Feature extraction, Deep learning, Model ensemble
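
Item (4) of the abstract mentions fusing individual classifiers under multiple ensemble strategies but does not say which strategies are used. As an illustration only, the sketch below shows two common ones, weighted soft voting over class probabilities and hard majority voting over predicted labels. The security-level names, the function names, and the example probabilities are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from typing import List, Optional, Sequence

# Hypothetical security levels; the thesis abstract does not list its label set.
LEVELS = ["public", "internal", "secret"]


def soft_vote(prob_rows: Sequence[Sequence[float]],
              weights: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted average of the class-probability vectors of several classifiers."""
    if not prob_rows:
        raise ValueError("need at least one classifier output")
    n_classes = len(prob_rows[0])
    if any(len(row) != n_classes for row in prob_rows):
        raise ValueError("all classifiers must score the same classes")
    if weights is None:
        weights = [1.0] * len(prob_rows)  # equal trust in every classifier
    if len(weights) != len(prob_rows):
        raise ValueError("one weight per classifier is required")
    total = float(sum(weights))
    return [
        sum(w * row[c] for w, row in zip(weights, prob_rows)) / total
        for c in range(n_classes)
    ]


def majority_vote(labels: Sequence[str]) -> str:
    """Hard voting: the most frequent label wins (ties go to the first seen)."""
    if not labels:
        raise ValueError("need at least one predicted label")
    return Counter(labels).most_common(1)[0][0]


def predict_level(prob_rows: Sequence[Sequence[float]],
                  weights: Optional[Sequence[float]] = None) -> str:
    """Fuse the classifiers by soft voting and return the winning level name."""
    fused = soft_vote(prob_rows, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return LEVELS[best]


if __name__ == "__main__":
    # Made-up outputs of three individual classifiers for one document.
    outputs = [
        [0.2, 0.5, 0.3],
        [0.1, 0.3, 0.6],
        [0.2, 0.3, 0.5],
    ]
    print(predict_level(outputs))
    hard_labels = [LEVELS[max(range(3), key=row.__getitem__)] for row in outputs]
    print(majority_vote(hard_labels))
```

Soft voting keeps each classifier's confidence, so one very certain model can outweigh two lukewarm ones, whereas hard voting only counts labels. Which of the two (or which other strategy, such as stacking) the thesis actually adopts is not stated in the abstract.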