
Research On Algorithms And Applications For Sentiment Lexicon Construction

Posted on: 2020-05-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Deng
Full Text: PDF
GTID: 1368330578976888
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid growth of the mobile internet, posting and sharing messages online has become a common daily activity. Although such data is noisy and massive, it contains many sentiments and opinions about events and services, which are important for understanding customers' preferences. These characteristics make such data attractive and valuable to companies and governments. Sentiment analysis, an important component of natural language processing, comprises many tasks, such as document-level, paragraph-level, and sentence-level sentiment classification. However, given the rapid growth of unlabeled text and the cost of data annotation, it is increasingly important to study lexicon-based sentiment classification, which can process massive data rapidly. Sentiment lexicon construction is therefore a very active research topic in sentiment analysis. Traditional approaches to constructing sentiment lexicons usually depend on manually annotated data or existing semantic networks. They usually have high precision but low recall and high cost. Furthermore, this rapidly growing data, including long text (e.g., news, wiki pages) and short text (e.g., microblogs, product reviews, tweets), is becoming ever more massive. Thus, developing sentiment lexicon construction methods is necessary and important. In this dissertation, we study sentiment lexicon construction and apply the resulting lexicons to review-based rating prediction. The main contents are as follows:

(1) To address the problem that the sentiment polarity of a word may change across domains or topics, we propose a topic detection based sentiment lexicon construction (TDSLC) algorithm. The proposed method introduces a new latent factor, sentiment, and models documents and words through the joint distribution of each topic-sentiment pair. Furthermore, we apply the hierarchical supervision information of documents and words to the hyper-parameters, which encourages the sentiment polarity of each word to be as close as possible to that of the documents. The experimental results show that accounting for the variability of word sentiment polarity across topics improves the accuracy of lexicon-based sentiment classification on movie review and Twitter datasets.

(2) Most deep learning based methods for constructing sentiment lexicons ignore the observation that different words usually contribute differently to distinguishing a document's sentiment polarity. We therefore propose a sparse self-attention neural network for sentiment lexicon construction (SSANNSLC). This approach uses the self-attention mechanism to compute the weights of words within a document. Meanwhile, an L1 regularizer is applied to these weights to reflect the intuition that only a minority of words in a document carry sentiment. The experimental results show that SSANNSLC efficiently extracts important or sentiment-bearing words and achieves state-of-the-art performance.

(3) To address the problem that most sentiment lexicon construction methods do not consider the effect of word position in a document, we propose an automatic position-sensitive sentiment lexicon construction algorithm (APSSLC). In natural language, writers tend to state the conclusion or the sentiment polarity at the end of a document. Consequently, when a sentiment word occurs at the end of a document, it usually plays an important role in distinguishing the sentiment polarity of the whole document. APSSLC maps each word into two vectors, which represent its semantics and its sentiment respectively; it also maps each position into a low-dimensional vector intended to capture position information. The experimental results on sentiment classification tasks show that position information has a positive effect on sentiment lexicon construction.

(4) Sentiment analysis has been applied to review-based rating prediction in recommender systems. In this dissertation, we apply sentiment lexicons to rating prediction in recommender systems. We first propose a neural Gaussian mixture model (NGMM) for review-based rating prediction. It efficiently extracts information from historical reviews and applies a Gaussian mixture model to the ratings. We then use the sentiment lexicon to obtain each word's sentiment polarity and feed these polarities into the NGMM; the resulting model is called the sentiment lexicon based neural Gaussian mixture model (SLNGMM). Comparing NGMM and SLNGMM with other baselines, we observe that the sentiment lexicon is very beneficial to rating prediction in recommender systems.
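As a rough illustration of the lexicon-based sentiment classification that the abstract uses as its evaluation task, the following minimal sketch sums per-word polarity scores and labels a document by the sign of the total. It is not the dissertation's method: the toy lexicon, its scores, and the function names `tokenize`, `lexicon_score`, and `classify` are all hypothetical, chosen only to show why a lexicon allows fast classification without labeled training data.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only):
# each word maps to a polarity score in [-1, 1].
TOY_LEXICON = {
    "good": 0.8,
    "great": 1.0,
    "boring": -0.7,
    "terrible": -1.0,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text, lexicon):
    """Sum the polarity scores of all tokens; unknown words count as 0."""
    return sum(lexicon.get(token, 0.0) for token in tokenize(text))


def classify(text, lexicon):
    """Label a document by the sign of its summed lexicon score."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in ["A great movie with a good cast", "Boring and terrible"]:
        print(review, "->", classify(review, TOY_LEXICON))
```

Because classification reduces to dictionary lookups and a sum, the quality of the result depends almost entirely on the lexicon, which is why domain-, weight-, and position-aware lexicon construction matters in this setting.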
Keywords/Search Tags: sentiment lexicon, topic detection, topic model, deep learning, self-attention mechanism, recommender system, rating prediction, text analysis, sentiment analysis, opinion mining