Research Of The Key Technologies In Electronic Document Security Inspection

Posted on: 2018-09-28  Degree: Master  Type: Thesis
Country: China  Candidate: G Yang  Full Text: PDF
GTID: 2348330518992940  Subject: Computer Science and Technology
Abstract/Summary:
In the information era, data is a valuable asset for enterprises, and protecting it from leakage is of great significance. This paper studies data leakage prevention and aims to build a practical file security inspection tool that prevents sensitive data from leaking. To this end, it investigates two key technologies: document classification and document structure inspection.

To improve the accuracy of security inspection for text documents, we introduce text classification to categorize documents and improve on the traditional feature selection algorithm tf-idf. tf-idf treats the document set as a whole and neglects the differences between categories. To address this problem, this paper proposes a new algorithm, tf-DE. By introducing inter-category dispersion and intra-category information entropy, tf-DE measures how a feature term is distributed within a category and across categories, and thereby remedies this shortcoming of tf-idf. In an experimental comparison with eight other frequently used algorithms, tf-DE consistently performs better than the others, which indicates that it is a more effective feature selection algorithm.

To guard against data hiding methods based on file structure, this paper studies the data structures of several common file types. For Microsoft Office documents, we design countermeasures against hiding methods such as extra-file embedding, image embedding, and OLE embedding. In addition, we implement file type verification and a file-end locating method based on knowledge of file structure.

Based on the research above, this paper designs and implements the File Security Inspection System. The system supports the detection of several information hiding methods based on file type structure, active attacks on image steganography, and text category checking based on text classification.
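The abstract does not say how file type verification and file-end locating are implemented, so the following is only a minimal sketch of the general idea, using PNG as an assumed example format. It checks the leading signature bytes, walks the chunk list to the IEND chunk, and reports any bytes stored after that logical end. The function names (`is_png`, `png_logical_end`, `trailing_bytes`, `make_chunk`) are hypothetical and are not taken from the thesis.

```python
import struct
import zlib

# Fixed 8-byte signature that every PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_png(data: bytes) -> bool:
    """File type verification: compare the leading bytes with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


def png_logical_end(data: bytes) -> int:
    """File-end locating: return the offset just past the IEND chunk.

    Each PNG chunk is: 4-byte big-endian length, 4-byte type,
    `length` bytes of data, 4-byte CRC (12 bytes of overhead).
    CRCs are not verified in this sketch.
    """
    if not is_png(data):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 12 + length
        if end > len(data):
            raise ValueError("truncated chunk")
        if chunk_type == b"IEND":
            return end
        pos = end
    raise ValueError("no IEND chunk found")


def trailing_bytes(data: bytes) -> bytes:
    """Return whatever is stored after the logical end of the PNG."""
    return data[png_logical_end(data):]


def make_chunk(chunk_type: bytes, payload: bytes) -> bytes:
    """Helper for the demo only: build one well-formed PNG chunk."""
    crc = zlib.crc32(chunk_type + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + chunk_type + payload + struct.pack(">I", crc)


if __name__ == "__main__":
    # A structurally minimal (not renderable) PNG with data appended after IEND.
    clean = PNG_SIGNATURE + make_chunk(b"IHDR", b"\x00" * 13) + make_chunk(b"IEND", b"")
    suspicious = clean + b"hidden payload"
    print(len(trailing_bytes(clean)), len(trailing_bytes(suspicious)))
```

A non-empty result from `trailing_bytes` does not prove that data is being hidden, but it is the kind of structural anomaly an inspection tool could flag for review. Other formats would need their own signature and end-marker logic.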
Keywords/Search Tags: text classification, feature selection, tf-idf, steganalysis, data leakage prevention