Research On Text Classification Based On Privacy Protection

Posted on: 2019-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: F F Feng
Full Text: PDF
GTID: 2438330572459564
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, text classification has become a foundational technique for applications such as personalized recommendation and personal customization. How to protect users' privacy effectively during text classification, however, has become a hot research topic at home and abroad. Building on existing research, this thesis focuses on combining privacy protection with text classification. Text classification typically consists of stages such as text preprocessing and the classification algorithm itself, while the purpose of privacy protection is to conceal the user's sensitive information. Starting from the original text classification framework, this thesis modifies the preprocessing stage to add privacy protection: the user's private information is concealed before it takes part in classification, which protects the user's privacy.

Firstly, a privacy protection method based on the key feature chain is proposed for the preprocessing stage. Privacy leaks often stem from key private information being disclosed as a whole, causing significant losses to users. In this thesis, after text preprocessing and word segmentation, the key private information in the text is identified and assembled into the user's key privacy information chain. The chain is then virtualized (blurred) so that it is unknown to subsequent applications, which provides the privacy protection. The core of the method is constructing the user's privacy information chain and blurring the private information. Experiments show that key privacy information chain protection is simple and effective: plugged in at the text preprocessing stage, it can identify the user's privacy information chain, virtualize it, and provide a desensitized set of elements for subsequent text classification.

Secondly, this thesis presents a new privacy-preserving SVM text classification algorithm based on spatial edge recognition. The preprocessing stage of the existing SVM text classification algorithm is reworked, and the privacy protection method based on the key feature chain is applied to it, producing privacy-protected training samples and a privacy-protected set of text elements to be classified. Using a vector-space grid representation and vector density calculation, the spatial edge detection algorithm is reconstructed, and the computation over high-dimensional features is carried out with a kernel function, improving the accuracy and efficiency of the SVM text classification algorithm. Experiments show that the method is effective and provides a fast approach to text classification.

Finally, to facilitate the research, this thesis designs and implements an experimental prototype platform for privacy protection in the preprocessing stage and for text classification. The modular design of each research stage supports both ongoing and future work.
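The preprocessing idea described above (identify key private items, collect them into a chain, then blur them before classification) can be illustrated with a minimal sketch. The abstract does not say how private items are recognized or what the blurred form looks like, so everything here is an assumption made for illustration: the regular expressions, the placeholder format, and the names `PATTERNS`, `build_privacy_chain`, and `blur` are hypothetical and are not the thesis's method.

```python
import re
from typing import List, Tuple

# Hypothetical recognizers for "key privacy information". The thesis does not
# specify its identification rules; these two patterns are illustrative only.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{4}-\d{4}\b"),
}

# One chain entry: (start offset, end offset, label of the private item).
ChainEntry = Tuple[int, int, str]


def build_privacy_chain(text: str) -> List[ChainEntry]:
    """Collect every recognized private item, ordered by position in the text."""
    chain: List[ChainEntry] = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            chain.append((match.start(), match.end(), label))
    chain.sort()
    return chain


def blur(text: str, chain: List[ChainEntry]) -> str:
    """Replace each item in the chain with a placeholder such as <EMAIL>."""
    pieces: List[str] = []
    cursor = 0
    for start, end, label in chain:
        if start < cursor:
            # Skip an item that overlaps one that was already replaced.
            continue
        pieces.append(text[cursor:start])
        pieces.append(f"<{label}>")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    sample = "Contact Li at li.wei@example.com or 138-0000-1234 about the loan."
    print(blur(sample, build_privacy_chain(sample)))
```

In a pipeline like the one the abstract outlines, the blurred text rather than the original would be passed on to feature extraction and the SVM classifier, so downstream stages never see the raw private values.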
Keywords/Search Tags: Privacy protection, critical privacy chain, spatial edge detection, SVM