Short Text Classification Based On Feature Extension

Posted on: 2019-11-12
Degree: Master
Type: Thesis
Country: China
Candidate: S Song
GTID: 2428330563458510
Subject: Software engineering
Abstract/Summary:
In recent years, with the rapid development of major social networking platforms such as Weibo and WeChat, and of e-commerce platforms such as Taobao and Jingdong, short texts have received more and more attention as a carrier of information. A key issue in natural language processing is how to obtain structured numerical features from abstract text features and classify texts by their intrinsic meaning. This thesis designs a deep learning based method to study how to extract and extend effective text features in order to improve classification performance.

First, this thesis introduces the detailed workflow of short text classification. For each stage, it briefly reviews several methods in common use before deep learning was introduced. On this basis, it outlines the advantages of deep learning methods for short text classification, describes the deep models commonly used for the task, and analyzes the characteristics of each model. This lays the foundation for the methods proposed later in the thesis.

Secondly, to obtain better classification results, this thesis presents a feature extraction method based on a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). In this method, the input layer first maps the words in the text to word vectors, and feature extraction is then performed by the CNN and the RNN separately. The two sets of features are weighted and combined, passed together into the feature fusion layer, and finally classified. Experiments on seven datasets compare single feature extraction with joint feature extraction. The effect of network parameter settings on model performance is also explored. Finally, the method is compared on two common datasets with short text classification methods proposed in recent years. The results show the effectiveness of the proposed method.

Thirdly, to address emotion classification within short text classification, this thesis proposes an emotion feature representation method based on term frequency-inverse document frequency (TF-IDF), and combines two feature representation dictionaries with the principal component analysis (PCA) algorithm. On top of the original semantic features, this method adds emotional tendency features for subsequent feature extraction and classification. In comparison experiments, classification tasks were performed on three datasets using different feature extraction methods. The classification accuracy of the algorithms that used the emotion feature representation exceeded that of the algorithms that used semantic features only. The results show the feasibility of the proposed method.
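
The joint CNN and RNN feature extraction described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not specify filter sizes, the RNN variant, or the weighting scheme, so everything below is an assumption made for illustration. In particular, the single scalar weight `alpha`, the plain tanh RNN, the random untrained weights, and all names (`conv_max_pool`, `rnn_last_state`, `fuse`) are hypothetical. It uses only the Python standard library.

```python
import math
import random

def conv_max_pool(vectors, filters, width):
    """CNN branch: ReLU convolution over word windows, then max-over-time pooling."""
    dim = len(vectors[0])
    # Zero-pad texts shorter than the filter width so at least one window exists.
    padded = vectors + [[0.0] * dim] * max(0, width - len(vectors))
    features = []
    for filt in filters:  # each filter is a flat list of width * dim weights
        best = 0.0  # ReLU outputs are never negative, so 0.0 is a safe floor
        for start in range(len(padded) - width + 1):
            window = [x for vec in padded[start:start + width] for x in vec]
            activation = max(0.0, sum(w * x for w, x in zip(filt, window)))
            best = max(best, activation)
        features.append(best)
    return features

def rnn_last_state(vectors, w_in, w_rec):
    """RNN branch (assumed plain tanh RNN): returns the final hidden state."""
    hidden = [0.0] * len(w_rec)
    for vec in vectors:
        hidden = [
            math.tanh(sum(a * x for a, x in zip(w_in[j], vec))
                      + sum(b * h for b, h in zip(w_rec[j], hidden)))
            for j in range(len(hidden))
        ]
    return hidden

def fuse(cnn_feat, rnn_feat, alpha):
    """Assumed fusion: scale each branch by a scalar weight, then concatenate."""
    return [alpha * c for c in cnn_feat] + [(1.0 - alpha) * r for r in rnn_feat]

def softmax(scores):
    """Numerically stable softmax over class scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

random.seed(0)  # fixed seed so the untrained demo is repeatable
DIM, WIDTH, N_FILTERS, HIDDEN, N_CLASSES = 4, 2, 3, 4, 2

def rand_matrix(rows, cols):
    """Random stand-in for trained parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

tokens = "fast delivery great phone".split()
embeddings = {tok: rand_matrix(1, DIM)[0] for tok in tokens}  # stand-in word vectors
vectors = [embeddings[tok] for tok in tokens]  # input layer: words -> word vectors

cnn_feat = conv_max_pool(vectors, rand_matrix(N_FILTERS, WIDTH * DIM), WIDTH)
rnn_feat = rnn_last_state(vectors, rand_matrix(HIDDEN, DIM), rand_matrix(HIDDEN, HIDDEN))
fused = fuse(cnn_feat, rnn_feat, alpha=0.5)  # fusion layer input

class_weights = rand_matrix(N_CLASSES, len(fused))
probs = softmax([sum(w * x for w, x in zip(row, fused)) for row in class_weights])
print(probs)  # class probabilities from untrained weights; values are not meaningful
```

In a trained model the filters, recurrent weights, and classifier would be learned jointly, and the branch weight could itself be a learned parameter rather than the fixed `alpha` used here.
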
Keywords/Search Tags:Short Text Classification, Feature Extraction, Feature Representation, Deep Learning