Research On Short Text Classification Method Based On Feature Extension

Posted on: 2022-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: C H Yang
GTID: 2518306743474114
Subject: Computer technology
Abstract/Summary:
In the era of big data, a large amount of short text is generated every day, mostly in the form of user reviews, online news, Q&A, and similar content. Classifying this volume of short text effectively is of great significance for applications such as news push, user preference mining, and user sentiment analysis. However, short texts are limited in length, carry little information, and have sparse features, so traditional text classification methods do not achieve good results on them. Accurate short text classification is therefore both a difficulty and a focus of current research. This dissertation studies short text classification and proposes two methods, one for sentiment classification and one for category classification.

First, this dissertation proposes a sentiment classification method for short texts based on new-word feature extension. To address the limited content and sparse features of short text, it applies a new-word discovery algorithm based on information entropy and mutual information, and uses a conditional random field and a synonym lexicon to extend the discovered new words. This enriches the words in the original short text that can be used for sentiment analysis and thereby extends its features. BiLSTM and an attention mechanism are then used to extract features from the extended short text and complete the sentiment classification task. Experiments show that the proposed algorithm achieves good classification results.

Second, this dissertation investigates category classification for short texts and proposes a category classification method that incorporates topic features. The method takes into account both the text feature information and the topic feature information of the short text itself. It first obtains the topic features of the short text through the LDA topic model, then obtains the text features through BiLSTM-CNN-Attention, and fuses the two to extend the short text features. Category classification is finally performed on the extended features. The feasibility of the method is verified through experiments.
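The new-word discovery step of the first method can be illustrated with a minimal sketch. The abstract gives no details of the algorithm, so everything here is an assumption: candidates are character n-grams, cohesion is measured by the lowest pointwise mutual information over all two-way splits, boundary freedom is measured by the smaller of the left and right neighbour entropies, and the function names and thresholds are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(counter):
    """Shannon entropy (in nats) of a Counter of neighbouring characters."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counter.values())


def score_candidates(texts, max_len=3):
    """Return {candidate: (pmi, branching_entropy)} for character n-grams.

    Assumption: probabilities of all n-grams are estimated against the
    total character count, which is a rough approximation.
    """
    ngram_counts = Counter()
    left_nbrs = defaultdict(Counter)
    right_nbrs = defaultdict(Counter)
    for text in texts:
        for n in range(1, max_len + 1):
            for i in range(len(text) - n + 1):
                gram = text[i:i + n]
                ngram_counts[gram] += 1
                if n > 1:
                    if i > 0:
                        left_nbrs[gram][text[i - 1]] += 1
                    if i + n < len(text):
                        right_nbrs[gram][text[i + n]] += 1
    total_chars = sum(c for g, c in ngram_counts.items() if len(g) == 1)

    def prob(gram):
        return ngram_counts[gram] / total_chars

    scores = {}
    for gram in ngram_counts:
        if len(gram) < 2:
            continue
        # Cohesion: the weakest split decides how tightly the parts bind.
        pmi = min(
            math.log(prob(gram) / (prob(gram[:k]) * prob(gram[k:])))
            for k in range(1, len(gram))
        )
        # Freedom: a real word should appear in varied contexts on both sides.
        freedom = min(entropy(left_nbrs[gram]), entropy(right_nbrs[gram]))
        scores[gram] = (pmi, freedom)
    return scores


def discover_new_words(texts, known_words, min_pmi=1.0, min_entropy=0.5,
                       max_len=3):
    """Keep candidates that pass both thresholds and are not already known."""
    scores = score_candidates(texts, max_len)
    return sorted(
        gram for gram, (pmi, freedom) in scores.items()
        if pmi >= min_pmi and freedom >= min_entropy and gram not in known_words
    )


if __name__ == "__main__":
    toy_corpus = ["axyb", "cxyd", "exyf", "gxyh"]
    print(discover_new_words(toy_corpus, known_words=set()))
```

In the toy corpus, only the string that recurs with different neighbours on both sides passes both thresholds. In a pipeline like the one described above, the surviving candidates would then be passed on to the extension step; the conditional random field and synonym lexicon are not sketched here.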
Keywords/Search Tags:Short text, Text classification, Attention mechanism, Deep learning, Feature extension