Research On Text Sentiment Classification Of Hotel Field

Posted on: 2018-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: K Y Chen
GTID: 2348330533461376
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of social networks in recent years, users increasingly post comments expressing sentiment about all kinds of things online. These comments, whose number is growing explosively, carry users' subjective information. Mining this subjective information and extracting what is valuable matters for product improvement, public opinion monitoring, product recommendation, and similar applications, so text sentiment classification has become a hot research topic. Text sentiment classification uses either unsupervised or supervised classification methods. This thesis introduces the key steps and related technologies of both kinds of method and applies both to sentiment classification of hotel-domain text. In the unsupervised method based on a sentiment lexicon, the lexicon does not cover enough sentiment words and the method for calculating the sentiment tendency value is not comprehensive. In the supervised method, some useful features cannot be identified at the feature extraction stage, and the traditional method for calculating feature weights has deficiencies. To address these problems, the main work is as follows:

1. Build a hotel-domain lexicon and propose a method for calculating the sentiment tendency value. First, the HowNet lexicon, the NTUSD lexicon, and the commendatory-derogatory lexicon are combined, and the result is further expanded with the word2vec tool to form a relatively complete hotel-domain sentiment lexicon. Then, taking into account special cases such as negation words and summing-up words, the sentiment words and special words in a review are matched against the lexicon and the sentiment tendency value is calculated with the proposed method. Hotel review texts are divided into positive and negative categories according to this value.

2. Propose special feature-selection strategies and an improved feature-weighting algorithm. Conventional feature selection tends to select words rather than phrases and ignores special cases such as negation words and nouns that carry sentiment. The classic TF-IDF (term frequency-inverse document frequency) weighting method considers only term frequency and inverse document frequency and does not take into account the internal distribution of feature terms within the classes. Building on the classic feature-selection and TF-IDF weighting methods, the feature set is expanded with the special feature-selection strategies, the position of feature terms within the comment is incorporated, and the PW-TF-IDF algorithm is proposed. A support vector machine (SVM) is used as the classifier to divide hotel review texts by sentiment tendency.

Experiments with both methods on hotel review data sets show good results, which preliminarily verifies the feasibility of the two methods.
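The two ideas in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract gives neither the sentiment tendency formula nor the definition of PW-TF-IDF, so everything below is an assumption made for illustration. The toy English lexicon, the one-token negation window, the tie-breaking rule, and the position rule (tokens in the first or last 20% of a review are weighted by a hypothetical `boost` factor) are all invented here, and the IDF term is the plain `log(N / df)`.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon. The thesis merges HowNet, NTUSD and a
# commendatory-derogatory lexicon and expands the result with word2vec.
LEXICON = {"clean": 1.0, "friendly": 1.0, "dirty": -1.0, "noisy": -1.0}
NEGATIONS = {"not", "never", "no"}


def sentiment_score(tokens):
    """Sum lexicon polarities, flipping the sign of a sentiment word
    when the token immediately before it is a negation (assumed rule)."""
    score = 0.0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok, 0.0)
        if polarity and i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    return score


def classify(tokens):
    """Label a review; a score of exactly 0 is treated as positive (assumption)."""
    return "positive" if sentiment_score(tokens) >= 0 else "negative"


def position_weight(index, length, boost=1.5):
    """Assumed rule: tokens in the first or last 20% of a review count `boost` times."""
    edge = max(1, int(0.2 * length))
    return boost if index < edge or index >= length - edge else 1.0


def pw_tf_idf(docs, boost=1.5):
    """Position-weighted TF-IDF: each occurrence contributes its position
    weight to the term frequency, which is then normalised and multiplied
    by log(N / df)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        weighted_tf = defaultdict(float)
        for i, term in enumerate(doc):
            weighted_tf[term] += position_weight(i, len(doc), boost)
        total = sum(weighted_tf.values())
        weights.append({
            term: (w / total) * math.log(n_docs / doc_freq[term])
            for term, w in weighted_tf.items()
        })
    return weights
```

In a supervised pipeline along the lines the abstract describes, the dictionaries returned by `pw_tf_idf` would be turned into feature vectors and passed to an SVM classifier. That step is omitted here to keep the sketch dependency-free.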
Keywords: sentiment lexicon, TF-IDF, hotel review, word2vec, sentiment classification