
Research On Opinion Object Extraction For Online Hot News Reviews

Posted on: 2022-10-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Li
Full Text: PDF
GTID: 2518306731961929
Subject: Computer software and theory
Abstract/Summary:
The rapid development of the Internet has significantly increased the speed at which information is collected and disseminated. Users obtain a large amount of news from news websites and apps every day. When an event occurs, many media organizations report on it online, and the resulting volume of reports and reprints forms a hot news event. Netizens read the news online and express opinions about the people, places, organizations, and events it mentions. Opinion target extraction, a subtask of opinion mining in Natural Language Processing (NLP), identifies the objects in a text about which users express opinions. Research on opinion target extraction for Chinese hot news reviews provides basic technical support for subsequent public opinion analysis and understanding, and has a wide range of applications. It is therefore of great significance to study opinion target extraction for online hot news reviews.

Much work has been done on opinion target extraction in China and abroad, but the following problems remain for comments on Internet hot news:
(1) There is no corpus that can support research on opinion target extraction for Internet hot news comments.
(2) News comments contain a lot of noise, which leads to poor recognition of opinion targets.
(3) Although character-level opinion target extraction works at a finer granularity than word-level extraction, individual characters lack part-of-speech information and location information.

This thesis addresses these problems and achieves the following results:

(1) A corpus of news reviews is constructed. News articles and their corresponding comments were crawled from news websites to serve as the corpus supporting this research.

(2) A method is proposed for calculating the semantic similarity between news text and comments that incorporates news title information. Semantic similarity calculation can effectively select the comments that are related to the news content, which improves the performance of the opinion target extraction model. Two difficulties are addressed: the large difference in length between news text and comments makes text semantic similarity hard to calculate well, and BERT handles very long text sequences poorly. The semantic similarity calculation is therefore converted into a calculation of the topic similarity of the two texts. With the news title information incorporated, the TextRank algorithm is used to condense the news text into a short text. The short text and the comment are passed into BERT to obtain a semantic fusion representation, and a topic model is introduced to calculate the topic distribution vectors of the short text and the comment. Finally, the semantic fusion representation and the topic distribution vectors are fused and passed into a fully connected layer to calculate the probability that the comment is related to the news. Experimental results indicate that the method is effective.

(3) A method is proposed for recognizing opinion sentences in news comments based on TextRank and the BERT pre-trained model. This thesis holds that opinion targets occur in opinion sentences, so comments can be filtered further through opinion sentence recognition, making opinion target extraction more effective. Traditional opinion sentence recognition methods do not incorporate external information, and BERT handles long text sequences poorly; to address both problems, news summary information is integrated to improve opinion sentence recognition. First, the TextRank algorithm generates an automatic summary of the news. Each comment on the news is then passed to the BERT model together with the news summary to obtain a text fusion representation. Finally, this representation is sent to a fully connected layer, and the softmax function converts the layer's output into the probability that the comment is an opinion sentence. Experimental results indicate that the method is effective.

(4) A method is proposed for extracting opinion targets from news comments that incorporates part-of-speech-location features and dictionary features. It addresses the fact that character-level opinion target extraction ignores the part-of-speech information and location information of words. First, a dictionary is constructed from the news text. Second, the comment text is processed with word segmentation and part-of-speech tagging, and a location tag set is used to tag location features, which gives each character a part-of-speech-location feature. Third, an n-gram method takes each character as the center, builds candidate words from the character's context, and matches them against the dictionary to construct dictionary features. The character sequence, the part-of-speech-location features, and the dictionary features are fused and passed to a BiLSTM-CRF network layer for sequence labeling. Experimental results indicate that the method is effective.
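The dictionary features described in (4) can be illustrated with a minimal sketch. The abstract does not give the exact n-gram templates, so the scheme below is an assumption: for each character it checks whether the n-gram ending at that character and the n-gram starting at that character appear in the dictionary. The names `dictionary_features`, `lexicon`, and `max_n`, as well as the sample sentence and dictionary entries, are hypothetical and used only for illustration.

```python
from typing import List, Set


def dictionary_features(chars: str, lexicon: Set[str], max_n: int = 4) -> List[List[int]]:
    """Build per-character binary dictionary features.

    For each character and each n in 2..max_n, two flags are emitted:
      - 1 if the n-gram ending at this character is in the lexicon
      - 1 if the n-gram starting at this character is in the lexicon
    This template set is an assumed simplification, not the thesis's own.
    """
    features: List[List[int]] = []
    length = len(chars)
    for i in range(length):
        row: List[int] = []
        for n in range(2, max_n + 1):
            # n-gram whose last character is chars[i]; empty if it would run off the left edge
            left = chars[i - n + 1 : i + 1] if i - n + 1 >= 0 else ""
            # n-gram whose first character is chars[i]; empty if it would run off the right edge
            right = chars[i : i + n] if i + n <= length else ""
            row.append(int(bool(left) and left in lexicon))
            row.append(int(bool(right) and right in lexicon))
        features.append(row)
    return features


# Hypothetical dictionary built from news text, and a hypothetical comment.
lexicon = {"北京", "大学", "北京大学"}
comment = "北京大学很好"
feats = dictionary_features(comment, lexicon)
for ch, row in zip(comment, feats):
    print(ch, row)
```

In a full model, each row would be embedded or concatenated with the character embedding and the part-of-speech-location feature before the BiLSTM-CRF layer; that fusion step is not shown here.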
Keywords/Search Tags: Semantic similarity, Opinion sentence recognition, Opinion target extraction, Text fusion representation, Feature engineering