Text Feature Representation And Classification Based On Deep Learning

Posted on:2017-07-21

Degree:Master

Type:Thesis

Country:China

Candidate:J Liang

Full Text:PDF

GTID:2348330485483643

Subject:Software engineering

Abstract/Summary:

With the development of the Internet, people produce ever larger amounts of unstructured text, and with it grows the need to automatically extract different types of knowledge from that text. This thesis uses deep learning to learn text features, in particular the structure and meaning of text, in order to solve several higher-level language tasks such as sentiment analysis and spam filtering. Experiments on sentiment analysis of microblog posts and reviews, and on risk filtering for Baidu's advertising, show the effectiveness of the proposed methods. The main contributions of the thesis are as follows.

First, the thesis summarizes several common deep learning network structures. To address potential issues with those methods, it proposes a new deep model, PRAE (Polarity transfer model based Recursive Auto-Encoder). Through training, the model learns structural and grammatical information from the text; when trained with the text's label information, it can be used for text categorization. The model learns not only word embedding representations but also, recursively, sentence representations that follow the topological structure of the sentence. PRAE can therefore learn text features automatically, avoiding manually designed features. Experiments show that text features learned by the deep model achieve state-of-the-art performance when used for text categorization.

Second, the thesis extends the LSTM (long short-term memory) memory unit and the recursive neural network (RSNN) into a sentiment polarity shifting model called PLSTM-RSNN. The model combines the advantages of the recursive neural network and the LSTM memory unit. The recursive neural network learns the structural information of a sentence, while long short-term memory networks are a special kind of RNN in which a memory cell carries information from the cells at previous time steps. We apply PLSTM-RSNN to text sentiment analysis, and experimental results show that our composition outperforms the traditional neural-network composition.

Finally, the thesis describes how, during an internship, the author used Baidu's Paddle platform to build an LSTM RNN to solve a risk filtering problem. In practice, deep learning models adapted well to different types of risk, and their efficiency reached the level required for practical application.
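
To make the idea of learning a sentence representation recursively through the sentence's structure more concrete, the following is a minimal sketch of the forward pass of a generic recursive auto-encoder. It is not the thesis's PRAE model: the polarity transfer mechanism and the label-based training term are not specified in the abstract and are left out. The class name `RecursiveAutoEncoder`, the embedding size `DIM`, the random word vectors, and the hand-written parse tree are all illustrative assumptions.

```python
import math
import random

DIM = 4  # embedding size; an illustrative assumption


def rand_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def affine(W, x, b):
    """Compute W @ x + b for list-based matrices and vectors."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


class RecursiveAutoEncoder:
    """Hypothetical generic recursive auto-encoder (forward pass only, untrained)."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.W_enc = rand_matrix(dim, 2 * dim, rng)  # [left; right] -> parent
        self.b_enc = [0.0] * dim
        self.W_dec = rand_matrix(2 * dim, dim, rng)  # parent -> [left'; right']
        self.b_dec = [0.0] * (2 * dim)

    def encode(self, left, right):
        """Compose two child vectors into one parent vector of the same size."""
        return [math.tanh(v) for v in affine(self.W_enc, left + right, self.b_enc)]

    def reconstruction_error(self, left, right, parent):
        """Squared error between the children and their reconstruction from the parent."""
        recon = affine(self.W_dec, parent, self.b_dec)
        return sum((a - b) ** 2 for a, b in zip(left + right, recon))

    def forward(self, tree, embeddings):
        """tree is a word (str) or a (left, right) tuple.

        Returns (node vector, summed reconstruction error of the subtree).
        """
        if isinstance(tree, str):
            return embeddings[tree], 0.0
        l_vec, l_err = self.forward(tree[0], embeddings)
        r_vec, r_err = self.forward(tree[1], embeddings)
        parent = self.encode(l_vec, r_vec)
        err = self.reconstruction_error(l_vec, r_vec, parent)
        return parent, l_err + r_err + err


# Usage: random stand-in word vectors and a hand-written binary parse tree.
rng = random.Random(1)
embeddings = {w: [rng.uniform(-1, 1) for _ in range(DIM)]
              for w in ["not", "very", "good"]}
tree = ("not", ("very", "good"))
rae = RecursiveAutoEncoder(DIM)
sentence_vec, total_err = rae.forward(tree, embeddings)
```

In a trained model of this kind, the weights and word vectors would be adjusted to reduce the summed reconstruction error (and, for classification, a label-based loss on the node vectors); the sketch only shows how the tree structure determines the order of composition.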

Keywords/Search Tags:

deep learning, text categorization, feature learning, RNN, LSTM

Related items

1	Research On News Text Classification Model Based On Deep Learning
2	Semantic Understanding Of Chinese Short Sentences Based On Deep Learning
3	Research On Text Categorization Technology Based On Deep Learning
4	A Study On Text Categorization Based On Machine Learning
5	Research On Multi-instance Multi-label Learning Based On Feature Learning
6	Research And Implementation Of Text Classification Based On Depth Learning Theory And SVM Technology
7	Research On Text Classification Based On Hybrid Model Of Deep Learning
8	Sentiment Text Classification Research Integrating CNN And Bi-LSTM Deep Learning Algorithms
9	Research On Text Clustering Algorithm Based On Deep Learning Feature Extraction
10	Research On Network Text Sentiment Classification Based On Deep Learning