
Research On Text Classification Method Based On Feature Embedding Representation

Posted on: 2021-01-17    Degree: Master    Type: Thesis
Country: China    Candidate: T S Wang    Full Text: PDF
GTID: 2428330602464606    Subject: Engineering
Abstract/Summary:
The development of computer technology has accelerated the information age and, with it, the exponential growth of data and the workload of processing it. To handle text data more efficiently, natural language processing (NLP) and its related research have received much attention. Text classification, a sub-task of NLP, is widely used in many fields, such as news categorization, digital libraries, sentiment analysis, and spam filtering. According to current research, text classification methods based on deep neural networks outperform those based on traditional machine learning, provided the classifier can be fully trained. The structure and application of deep neural networks are therefore an important route to better classification performance and a main research direction in the field.

The effectiveness of text classification depends not only on the design of the classifier but also on how text features are constructed. For discrete text data, building a specific, interpretable language model to obtain an embedding representation of the text, and improving the feature embedding representation to raise the quality of text quantization, are effective ways to improve classifier performance indirectly. Existing methods achieve excellent performance by combining text quantization methods with text classifiers, so improving feature embedding representation methods and combining them with deep neural networks is an effective way to improve text classification. Based on an analysis of the applications and process of text classification, the significance of text classification methods based on feature embedding representation is expounded. The specific research contents are as follows:

(1) A novel multi-label text classification method combining a dynamic semantic representation model and a deep neural network (DSRM-DNN) is proposed. DSRM-DNN uses a word embedding model and a clustering algorithm to select semantic words; the selected words are designated as the elements of DSRM-DNN and quantified by a weighted combination of word attributes. A text classifier is then constructed by combining a deep belief network with a back-propagation neural network, and low-frequency words and new words are re-expressed by the existing semantic words under a sparse constraint (see the sketch following this abstract). Experiments on RCV1-v2, Reuters-21578, EUR-Lex, and Bookmarks show that DSRM-DNN outperforms the comparative methods.

(2) A text classification framework combining a character-level convolutional network and a generative adversarial network (CCNN-GAN) is proposed. Texts are quantified by the character-level convolutional neural network (char-level CNN), and the resulting features are fed to the adversarial network and the classifier respectively. In the data augmentation module, the processed real data are used to train the generator and the discriminator so that the generative distribution progressively fits the real-data distribution; the classifier is then incrementally trained on both the real and the generated data. In this way the small-sample problem is addressed and the cost of text generation is reduced. Extensive experiments on four public datasets show that the method performs significantly better than the comparative methods.
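The DSRM-DNN description above can be read as a pipeline: select semantic words by clustering word embeddings, quantify each document over those words, then train a deep classifier. The sketch below is a minimal, hypothetical rendering of that pipeline; the function names, the plain term-frequency quantification, and the use of an MLP in place of the deep belief network + back-propagation classifier are assumptions for illustration, not the thesis implementation.

```python
# Hypothetical sketch of the DSRM-DNN feature pipeline (names illustrative only).
# An MLP stands in for the DBN + back-propagation classifier described above.
import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier

def select_semantic_words(tokenized_docs, n_semantic=500, dim=100):
    """Train word embeddings, cluster them, and keep the word closest to each
    cluster centre as a 'semantic word' (assumes the vocabulary > n_semantic)."""
    w2v = Word2Vec(sentences=tokenized_docs, vector_size=dim, min_count=2, epochs=10)
    vocab = list(w2v.wv.index_to_key)
    vectors = np.stack([w2v.wv[w] for w in vocab])
    km = KMeans(n_clusters=n_semantic, n_init=10).fit(vectors)
    semantic_words = []
    for c in range(n_semantic):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(vectors[members] - km.cluster_centers_[c], axis=1)
        semantic_words.append(vocab[members[np.argmin(dists)]])
    return semantic_words

def quantify(tokenized_docs, semantic_words):
    """Represent each document over the semantic words. Here the weight is plain
    term frequency; the thesis' weighted combination of word attributes is richer."""
    index = {w: i for i, w in enumerate(semantic_words)}
    X = np.zeros((len(tokenized_docs), len(semantic_words)))
    for d, doc in enumerate(tokenized_docs):
        for tok in doc:
            if tok in index:
                X[d, index[tok]] += 1.0
    return X

# Usage sketch (corpus loading is user-supplied):
# docs, labels = load_corpus()                      # tokenized docs, label matrix
# words = select_semantic_words(docs)
# clf = MLPClassifier(hidden_layer_sizes=(256, 128)).fit(quantify(docs, words), labels)
```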
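Similarly, the CCNN-GAN framework can be summarized as a character-level CNN encoder whose features feed both a GAN (for data augmentation) and a classifier. The PyTorch modules below are an illustrative sketch under that reading; the layer sizes, the 70-character alphabet, and generating in feature space rather than raw text are assumptions, not the thesis architecture.

```python
# Illustrative PyTorch sketch of the CCNN-GAN layout (module names are assumptions).
import torch
import torch.nn as nn

class CharCNNEncoder(nn.Module):
    """Character-level CNN: one-hot character channels -> convolutions -> feature vector."""
    def __init__(self, n_chars=70, feat_dim=256, max_len=1014):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_chars, 256, kernel_size=7), nn.ReLU(), nn.MaxPool1d(3),
            nn.Conv1d(256, 256, kernel_size=3), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        self.proj = nn.Linear(256, feat_dim)

    def forward(self, x):                      # x: (batch, n_chars, max_len)
        return self.proj(self.conv(x).squeeze(-1))

class Generator(nn.Module):
    """Maps noise to synthetic text features used for data augmentation."""
    def __init__(self, noise_dim=100, feat_dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(noise_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim))

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Scores whether a feature vector comes from real or generated text."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, f):
        return self.net(f)

# The classifier is trained incrementally on real and generated features;
# the class count below is illustrative.
classifier = nn.Linear(256, 4)
```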
Keywords/Search Tags: text classification, deep belief network, sparse representation, character-level convolutional neural network, deep learning