Font Size: a A A

Text Feature Representation And Classification Based On Deep Learning

Posted on:2017-07-21Degree:MasterType:Thesis
Country:ChinaCandidate:J LiangFull Text:PDF
GTID:2348330485483643Subject:Software engineering
Abstract/Summary:PDF Full Text Request
As the development of the Internet and a large number of unstructured text which humanity produces overall, so does need to automatically extract the different types of knowledge from it. The thesis attempts to use deep learning to learn text features, in particular its structure and meaning in order to solve multiple higher level language tasks, such as sentiment analysis, filtering spam. By analyzing sentiment on microblog and review and filtering risk on Baidu's advertising, the results show the effectiveness of the proposed method. The main works of the thesis are:The thesis has summarized several common network structures of deep learning. Based on the potential issues of those methods, a new deep model—PRAE(Polarity transfer model based Recursive Auto-Encoder) is proposed. By training the model can learn the structure and grammar information from the text, while training with the label information of the text can be used for text categorization. This model can not only learn the word embedding representation, but also can learn recursively the sentence representation through the topological structure of the sentence. So we can automatically learn the features of the text by PRAE, thus avoiding the manually designed features. Experiments show that through deep learning model to learn text feature using for text categorization can achieve state of the art performance.Another work of the thesis is that we propose to extent LSTM(long short term memory) memory unit and recursive neural network(RSNN) to sentiment polarity shifting model which is called PLSTM-RSNN model. The model combines the advantages of recursive neural network and LSTM memory unit. The structure information of sentence can be learned by recursive neural network, and long short term memory networks are a special kind of RNN, which a memory cell can reflect the history memories of last time cells. We employ the model PLSTM-RSNN for text sentiment analysis, experiments results show that our composition outperformed the traditional neural-network composition.Finally, the thesis describes that I use the Baidu's paddle platform to build LSTM RNN neural network to solve the problem of risk filtering during internship. Practice has proved that deep learning models has a strong adaptability for different types of risk, and its efficiency can reach the level of practical application.
Keywords/Search Tags:deep learning, text categorization, feature learning, RNN, LSTM
PDF Full Text Request
Related items