A Study On Hierarchical Text Representation And Sentiment Classification

Posted on: 2020-09-02  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Hu
GTID: 2428330572974169  Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, the amount of data generated has exploded in a short time. As an unstructured or semi-structured information carrier, text has become an important part of Internet content. How to mine valuable information from it more effectively and make rational use of it is a major challenge in information science and technology.

This thesis focuses on sentiment classification, which requires categorizing a text according to its overall emotional tendency, or rating comments on a website on a scale of 1 to 5 stars. To achieve good classification accuracy, the key points are building good text representations and identifying the positive, negative, and neutral expressions in the content together with their sentiment intensity. However, current models usually ignore the compositional structure of documents and have shortcomings in both the quality of their text representations and their attention to sentiment content. Here, we discuss how to improve sentiment classification performance from these two aspects: text representation and attention to sentiment content. The main work and contributions are as follows:

(1) We propose a hierarchical text representation method with a central constraint to improve the general text representation. Our Central Constraint Hierarchical Attention Network (CCHAN) first uses a bidirectional GRU to encode word representations and obtains sentence vectors by weighted summation with an attention mechanism. We then repeat similar operations on the sentence vectors to obtain the document representation. In CCHAN, we use the central constraint loss to generate text representations that are highly cohesive within the same class in the vector representation space. The experimental results show that the hierarchical representation can improve the training speed of the model by about 35%, while the central constraint loss can reduce the Root-Mean-Square Error (RMSE) of the classification results by about 8%. In addition, the accuracy of sentiment classification also demonstrates the validity of CCHAN.

(2) To identify and focus on expressions with strong sentiment, we propose a text sentiment classification method based on attention to sentiment content. In the Hierarchical Sentiment Attention Network (HSAN), we design a sentiment evaluation auxiliary network that evaluates word sentiment information in context. In addition, we design a stage-by-stage joint loss for training the classifier and the sentiment evaluation auxiliary network together, further adjusting the attention weights according to the sentiment evaluation. The word sentiment evaluations and the visualization of the attention weight distribution show that HSAN can recognize sentiment content and increase its contribution to the document representation, which ultimately improves the accuracy of sentiment classification.

In this thesis, we conduct experiments on four real public sentiment classification datasets: Yelp 2013, Yelp 2014, Yelp 2015, and IMDB. The experimental results show that the proposed models outperform other recent strong models and perform well on the tasks of text representation and sentiment classification.
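
The two ideas behind CCHAN described in contribution (1), attention-weighted summation applied first to words and then to sentences, and a loss term that pulls same-class document vectors together, can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes that the hidden states are stand-ins for bidirectional GRU outputs, that attention scores are plain dot products with a context vector, and that the central constraint resembles a mean squared distance to a per-class center. The abstract does not give the exact formulas, and every function and variable name below is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Weighted sum of hidden states.

    Assumption: the score of each state is its dot product with a
    context vector; the thesis may use a different scoring function.
    """
    scores = [sum(h_k * c_k for h_k, c_k in zip(h, context))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[k] for w, h in zip(weights, hidden_states))
              for k in range(dim)]
    return pooled, weights


def encode_document(doc_word_states, word_context, sentence_context):
    """Pool words into sentence vectors, then sentences into a document vector.

    Simplification: the sentence vectors are pooled directly, without the
    second encoder pass that a full hierarchical model would apply.
    """
    sentence_vectors = [attention_pool(words, word_context)[0]
                        for words in doc_word_states]
    return attention_pool(sentence_vectors, sentence_context)


def center_penalty(doc_vectors, labels, centers):
    """Mean squared distance between each document vector and its class center.

    Assumption: a center-loss-style term standing in for the thesis's
    central constraint loss.
    """
    total = 0.0
    for vec, label in zip(doc_vectors, labels):
        total += sum((v - c) ** 2 for v, c in zip(vec, centers[label]))
    return total / len(doc_vectors)


# Toy document with two sentences; each word state is a 2-dimensional vector.
toy_doc = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [1.0, 1.0], [0.0, 2.0]],
]
doc_vector, sentence_weights = encode_document(toy_doc, [1.0, 0.0], [0.0, 1.0])
penalty = center_penalty([doc_vector], [0], {0: [0.0, 0.0]})
```

In a trained model the context vectors and class centers would be learned, and a penalty of this kind would be added to the classification loss with a weighting factor. The abstract does not state that weighting, so it is left out here.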
Keywords/Search Tags:Sentiment Classification, Sentiment Computing, Text Representation, Attention Mechanism, Deep Neural Network