
Representation Learning Based Word Embedding Extraction And Its Application On Sentiment Analysis

Posted on: 2020-11-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Zhang
GTID: 1368330578464009
Subject: Control Science and Engineering
Abstract/Summary:
Text sentiment analysis uses computational techniques to detect, label, classify, or extract subjective content from natural language texts, with the aim of determining sentiment polarity. Word representation can be viewed as its premise. The principal problem of word representation is to extract and analyze semantic knowledge from unstructured data and to interpret the semantic and syntactic relationships between words mathematically, in order to establish a mechanism for transferring information between humans and machines. As online social texts become increasingly multi-source, multi-topic, and informal, their inherent redundancy and dynamics substantially increase the burden of semantic abstraction and extraction, making sentiment analysis more complicated and challenging. This work starts from refining contexts and optimizing state-of-the-art word embedding models according to the intrinsic properties of online data, with the purpose of solving fine-grained sentiment analysis problems. The main research is presented as follows:

(1) To address the imbalance in the distribution of context distances, this work proposes a word embedding extraction method based on salient features. To maintain the authenticity and reliability of text information, a semantic relatedness principle is designed from the perspective of word distance. On this basis, a rareness standard for contexts is established according to the context distribution, and a sequence of salient features for a target word is then determined. This method can overcome the ambiguity, disorder, and noise of text data and has the advantage of representing global context information. The experimental results show that this method can significantly improve the performance of current models on semantic similarity tasks.

(2) To address the instability in the distribution of context positions, this work proposes a word embedding extraction method based on refined contexts. To adaptively select contexts with different distances and position variations, this method strengthens the scaling effect of context distance, especially for distant contexts. It also derives a distributional measure of contexts based on statistical position variations, in particular to increase the effect of contexts that frequently occur at fixed positions or appear evenly across the word window. This method can improve the interpretive ability of contexts, and the experimental results show that it exhibits strong flexibility and self-adaptability.

(3) To reduce the complicated steps employed in traditional deep learning models for aspect-level sentiment analysis, this work proposes an attention-based word embedding method. To eliminate the fuzziness of polysemy and the ambiguity of antonyms, this method designs an attention vector containing two sub-vectors: a dimensional attention sub-vector that measures the relatedness between dimensions and topics, and a sentimental attention sub-vector that determines the sentiment significance of words. This work further proposes a cellular automata based artificial bee colony algorithm for optimizing the attention vector, which can be used directly as the input of a convolutional neural network to solve aspect-level sentiment classification problems without modifying the model structure. Thus, the method is highly adaptable and general and shows clear advantages over other models.

In summary, this work concentrates on word embedding extraction and sentiment analysis. On the one hand, it examines in depth the necessity of refining contexts from the perspectives of context distance and position variation. On the other hand, it builds connections between spatial dimensions and word meanings based on the semantic characteristics of words. Finally, I sincerely hope that this work can offer some innovation and value for subsequent research in the field of word representation and sentiment analysis.
Keywords/Search Tags: sentiment analysis, representation learning, word embedding extraction, aspect level sentiment classification, deep learning