The Research On Short Text Mining With Conditional Random Fields And Improved LSTM

Posted on: 2020-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: X L Gu
Full Text: PDF
GTID: 2428330596485772
Subject: Electronic Science and Technology
Abstract/Summary:
With the continuous development of Internet technology, Internet products have multiplied, from social platforms such as Weibo, QQ, and WeChat to online shopping platforms such as Jingdong and Taobao. As people's online activity grows more frequent, the comments they publish form a large amount of short text data scattered across the Internet. Effectively mining the rich emotions and attitudes that publishers express in this data offers clear guidance for public opinion monitoring by government departments, for sellers developing marketing strategies, and for buyers making purchase decisions. Short text data tends to be short in length, sparse in contextual features, and colloquial in expression. As a result, context information is hard to exploit fully and the grammar of the text is often irregular, which poses great challenges for short text data mining.

In recent years, conditional random fields and deep learning models have been used more and more widely in image processing, text mining, and personalized recommendation systems. The conditional random field is a model based on a conditional probability distribution. It overcomes the label bias problem common in sequence labeling and can effectively extract relevant information, such as the evaluation objects contained in comment text. Deep learning models, in turn, can actively learn the sentiment orientation contained in review texts under weak supervision. For these reasons, both models are receiving more and more attention in the field of short text mining. Because the sentiment orientation of a review text is closely related to the evaluation object in that text, this dissertation builds on the conditional random field and deep learning models to propose short text mining methods for two tasks: evaluation object recognition and sentiment analysis. The main work of this dissertation is:

(1) Because short text context is sparse and its grammar is often irregular, grammatical features are of limited use. This dissertation therefore proposes an evaluation object recognition method based on word features and semantic features. The method introduces semantic features into the conditional random field model to capture comment structures of the form "agent + adjectival sentiment word" and "verbal sentiment word + patient", then converts these features into feature functions to train a conditional random field model. Finally, it combines the semantic features with other types of features, trains a conditional random field model for each combination, and selects the feature combination with the best recognition performance. In experiments on a hotel review corpus and a mobile phone review corpus, semantic features improved precision (P), recall (R), and F value compared with grammatical features, which demonstrates the effectiveness of introducing semantic features. The combination of word features and semantic features achieved the best results.

(2) Because short text context features are sparse, and each word in a sentence affects sentiment polarity differently, this dissertation proposes a short text sentiment analysis method based on an Attention-BiLSTM model. The method uses the standard LSTM model to model the sentence in both the forward and backward directions, and introduces an attention mechanism to give higher weight to the more important words in the sentence. Considering that different evaluation objects in a sentence may correspond to different sentiment polarities, this dissertation further improves the model by integrating the evaluation object information before the hidden layer vectors are fed into the attention layer. In experiments on the restaurant review corpus of SemEval 2014 Task 4, the proposed model achieved higher accuracy than the LSTM, BiLSTM, and TDLSTM models.
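
The attention step described in (2) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names, the single projection matrix `proj`, the scoring vector `score_vec`, and the toy numbers are all assumptions made for illustration. In a trained model the hidden states would come from a BiLSTM and the parameters would be learned. The sketch joins each hidden state with an evaluation object vector before scoring, normalizes the scores with a softmax, and returns the weighted sum of the hidden states.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, aspect_vec, proj, score_vec):
    """Weight each word's hidden state, conditioned on the evaluation object.

    hidden_states: one vector per word (assumed to be BiLSTM outputs).
    aspect_vec:    vector for the evaluation object (hypothetical embedding).
    proj:          rows of a projection applied to [hidden state; aspect].
    score_vec:     vector that turns the projected features into one score.
    """
    scores = []
    for h in hidden_states:
        joined = list(h) + list(aspect_vec)  # fuse aspect info before attention
        mixed = [math.tanh(sum(w * x for w, x in zip(row, joined)))
                 for row in proj]
        scores.append(sum(v * m for v, m in zip(score_vec, mixed)))
    alphas = softmax(scores)
    dim = len(hidden_states[0])
    # Sentence representation: attention-weighted sum of the hidden states.
    pooled = [sum(a * h[i] for a, h in zip(alphas, hidden_states))
              for i in range(dim)]
    return alphas, pooled


# Toy input: three words with 2-dimensional hidden states, 1-dimensional aspect.
states = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1]]
alphas, pooled = attention_pool(states, [0.5], [[1.0, 1.0, 1.0]], [1.0])
print(alphas, pooled)
```

In this toy input the second word receives the largest weight because its projected feature is the largest. The pooled vector would then be passed to a classifier that predicts the sentiment polarity for the given evaluation object.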
Keywords/Search Tags: evaluation object identification, conditional random fields, sentiment analysis, LSTM, attention mechanism