Research On Automatic Text Summarization Based On User Comments

Posted on:2022-09-19

Degree:Master

Type:Thesis

Country:China

Candidate:M Yuan

Full Text:PDF

GTID:2518306350489844

Subject:Management Science and Engineering

Abstract/Summary:

With the development of the Internet and e-commerce, more and more people book hotels through online platforms, and a large number of reviews have been generated as a result. Effectively discovering valuable information in such massive data has become a challenge. Researchers have proposed many technologies to address this problem, and automatic text summarization is one of them. This thesis uses automatic text summarization to mine the crucial information in reviews, relieve hotel review information overload, provide consumers with a reference, and put forward suggestions for hotel managers. Because the reviews include fake ones, fake reviews must be identified and their interference excluded before the reviews are summarized. At present, the relevant public data sets are mainly in English, which increases the difficulty of domestic real-world applications. Consequently, the research builds its data set from real-world Chinese hotel review data, taking reviews of eight Piao HOME chain hotels on Ctrip.com as the research object.

The work consists of two parts: fake review identification and automatic text summarization. Fake review identification is the preliminary work that ensures the authenticity of the summary, and it is carried out with a supervised classification method. First, the data are obtained, cleaned, and organized, and irrelevant reviews are deleted. Then identification features are selected as the basis for manual labeling, and fake reviews in the data set are labeled manually. Next, the BERT language model is used to represent the text and to train, validate, and test the classification model; the model is evaluated and then used to predict the unlabeled data. Finally, the genuine reviews are kept and the fake ones are eliminated.

The automatic text summarization task proceeds as follows. First, because Chinese reviews are short and syntactically irregular, the genuine reviews are broken down into fine-grained clauses for text representation. Then the K-Means clustering algorithm is used to extract summary sentences, which form the review summary. Finally, to check and supplement the review summary, the TextRank algorithm is used to extract keywords. Before keyword extraction, and building on the earlier data processing, the reviews are segmented into words, stop words are removed, and parts of speech are tagged.

Synthesizing the keywords and the review summaries, the study draws the following conclusions. First, irrelevant reviews account for a considerable proportion of the data. Second, the fake review identification features ensure the consistency and objectivity of the manual annotation of the data set. Third, the evaluation metrics of the fake review identification model are at a high level, indicating that the method is effective for fake review identification in this research, so the model can be applied to detecting fake reviews among newly added reviews. Fourth, the keywords and the review summary are largely consistent in content, which verifies the summary. Finally, the advantages and problems of each chain hotel are identified, and they are highly similar across hotels. Several suggestions are put forward in response to the hotels' problems. The research achieves good results and can solve the problem of information overload. The whole process is also applicable to the automatic summarization of reviews of other hotels. It is an applied case study of the Chinese multi-document short-text summarization method.
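The TextRank keyword step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `textrank_keywords`, its parameters, and the English sample tokens are hypothetical, and the sketch assumes that word segmentation has already been done upstream. It filters stop words before building the co-occurrence graph, which is a simplification, and it omits part-of-speech filtering.

```python
from collections import defaultdict


def textrank_keywords(tokens, top_k=5, window=2, damping=0.85,
                      iters=50, stop_words=frozenset()):
    """Rank words with TextRank over an undirected co-occurrence graph.

    `tokens` is an already segmented review (assumption: segmentation
    happens upstream); all names here are illustrative only.
    """
    words = [t for t in tokens if t not in stop_words]

    # Link two different words when they occur within `window` positions.
    neighbors = defaultdict(set)
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != w:
                neighbors[w].add(words[j])
                neighbors[words[j]].add(w)
    if not neighbors:
        return []

    # PageRank-style iteration: each word shares its score equally
    # among the words it is linked to.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iters):
        new_scores = {}
        for w in neighbors:
            rank = sum(scores[u] / len(neighbors[u]) for u in neighbors[w])
            new_scores[w] = (1 - damping) + damping * rank
        scores = new_scores

    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


if __name__ == "__main__":
    sample = ["room", "clean", "room", "quiet", "room", "bed",
              "the", "room", "service"]
    print(textrank_keywords(sample, top_k=3, window=1, stop_words={"the"}))
```

In this toy input every other word co-occurs with "room", so "room" receives the highest score. For real Chinese reviews the token list would come from a word segmenter, and candidate words would normally also be restricted by part of speech, as the abstract describes.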

Keywords/Search Tags:

hotel review, deceptive review, text summarization, BERT, K-Means

Related items

1	Research On Deceptive Review Detection Method Based On Topic Sentiment Model
2	Research On Review Summary Generation Based On Text Summarization
3	Design And Implementation Of Book Review Emotion Analysis System Based On Bert Model
4	Research On Recommendation Algorithm Based On Review Text
5	Research On Deceptive Review Detection Based On Multi-view Learning
6	Automatic Summarization Of Book Review Based On Multi Source
7	Detecting Review Spammers Based On Review Feature
8	Design And Implementation Of E-Commerce Review Mining System Based On BERT
9	Research On Spam Review Filtering And Its Application In Review Management System Of Scratch Productions
10	Research On The Influence Of Psychological Distance On Hotel Online Review