
Research On Approaches For Cross-lingual Sentiment Analysis

Posted on: 2018-02-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Chen
GTID: 1318330512986009
Subject: Computer software and theory
Abstract/Summary:
With recent advances in natural language processing and machine learning, sentiment analysis has been widely studied, and methods at the word, aspect, phrase, sentence and document levels have gradually matured. Cross-lingual sentiment analysis (CLSA), however, is a special sentiment analysis problem that still performs unsatisfactorily. The main reason is that the disjoint feature spaces of different languages greatly obstruct the sharing of sentiment resources and sentiment knowledge from the source language to the target language. Making this knowledge sharing dependable requires studying how the patterns of sentiment representation differ across languages. How to build a dependable knowledge-transfer channel, or a credible sentiment knowledge connection, that reduces this difference therefore remains the bottleneck in solving CLSA problems. Although CLSA has drawn much attention and developed considerably in recent years, it still faces challenges. To address them, we first discuss the general side of CLSA research, that is, sentiment knowledge transfer and adaptation across languages; we then discuss the side specific to CLSA by studying the discrepancies and correlations between the sentiment expressions of different languages. The work is described as follows:

1. Using Cross-sampling and Integrating Structural Sentiment Information

Based on co-training, we propose a mutual-learning framework for cross-lingual sentiment analysis that combines a cross-sampling strategy with structural sentiment information. First, we use a heuristic method to extract sentiment expressions from the training data and add them to the n-gram features to form a highly sentiment-expressive feature space. We then integrate a cross-sampling strategy into the traditional co-training framework so that sentiment knowledge is learned mutually from unlabeled data in both languages. During learning, the sentiment knowledge of each language is fused into the other. Finally, the proposed framework yields a sentiment classifier in the source language. Experimental results show that, compared with existing CLSA methods, the proposed method makes efficient use of a small amount of labeled data and massive unlabeled data in both languages to obtain a more dependable, higher-quality sentiment classifier in the target language.

2. Learning to Adapt Credible Knowledge During Knowledge Transferring

Cross-lingual sentiment analysis is the task of identifying the sentiment polarity of texts in a low-resource language by using sentiment knowledge from a resource-abundant language. Most existing approaches are driven by transfer learning, but errors carried over during transfer keep their performance below a promising level. In this paper, we propose to integrate a knowledge validation model into knowledge transfer. The model identifies highly credible knowledge and thereby limits the negative influence of wrong knowledge. Experimental results demonstrate the necessity and effectiveness of the model.

3. Capturing Cross-Lingual Sentiment Relations

Sentiment connection is the basis of CLSA solutions. Most existing work focuses on general semantic connections, where misleading information from non-sentiment semantics probably leads to relatively low efficiency. In this paper, we propose to capture the document-level sentiment connection across languages (called the cross-lingual sentiment relation) with a joint two-view convolutional neural network (CNN), namely Bi-View CNN. Inspired by relation embedding learning, we first project the extracted parallel sentiments into a bilingual sentiment relation space, then capture the relation by subtracting them with an error tolerance. The bilingual sentiment relation considered in this paper is the sentiment polarity shared by two parallel texts. Experiments on a benchmark dataset suggest the effectiveness and efficiency of the proposed approach.

4. Learning Intrinsic Sentiment-Expression Discrepancies between Languages

Different languages express sentiment in different patterns, and discovering correlations between cross-lingual sentiment expression patterns is the key to cross-lingual sentiment analysis. This paper explores robust bilingual polarity correlations (RBPCs), which are defined as intrinsic sentiment-expression discrepancies between languages. We learn RBPCs by modeling these discrepancies over labeled training data and use the learned RBPCs to identify the polarity of test data. We propose two relation-based bilingual sentiment translation models to learn RBPCs, both of which translate parallel sentiments across languages. Experiments on a benchmark dataset demonstrate the effectiveness and necessity of the proposed models.
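The co-training idea behind the first contribution can be illustrated with a minimal sketch. This is not the dissertation's implementation: the word-count scorer, the function names (`train`, `score`, `confident_picks`, `co_train`), the toy tokens, and the reading of "cross-sampling" as "each view hands its most confident picks to the other view" are all assumptions made for illustration. The abstract does not specify the classifiers, features, or sampling rule.

```python
from collections import Counter

def train(examples):
    """Count word occurrences per label; examples are (tokens, label) pairs."""
    counts = {1: Counter(), -1: Counter()}
    for tokens, label in examples:
        counts[label].update(tokens)
    return counts

def score(model, tokens):
    """Signed score: positive-class hits minus negative-class hits."""
    return sum(model[1][t] - model[-1][t] for t in tokens)

def confident_picks(model, pool, view, k):
    """Indices of the k pool documents with the largest nonzero |score| in one view."""
    scored = [(abs(score(model, doc[view])), i) for i, doc in enumerate(pool)]
    scored = [(s, i) for s, i in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]

def co_train(labeled_src, labeled_tgt, unlabeled, rounds=2, k=1):
    """unlabeled holds (src_tokens, tgt_tokens) pairs: two views of one document."""
    labeled_src, labeled_tgt = list(labeled_src), list(labeled_tgt)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        model_src, model_tgt = train(labeled_src), train(labeled_tgt)
        picks_src = confident_picks(model_src, pool, 0, k)
        picks_tgt = confident_picks(model_tgt, pool, 1, k)
        # Assumed cross-sampling: each view's confident labels extend the OTHER view.
        for i in picks_src:
            label = 1 if score(model_src, pool[i][0]) > 0 else -1
            labeled_tgt.append((pool[i][1], label))
        for i in picks_tgt:
            label = 1 if score(model_tgt, pool[i][1]) > 0 else -1
            labeled_src.append((pool[i][0], label))
        used = set(picks_src) | set(picks_tgt)
        pool = [doc for i, doc in enumerate(pool) if i not in used]
    return train(labeled_src), train(labeled_tgt)

# Toy data; the target-language tokens are made-up placeholders.
labeled_src = [(["good", "great"], 1), (["bad", "awful"], -1)]
labeled_tgt = [(["hao"], 1), (["cha"], -1)]
unlabeled = [
    (["good", "superb"], ["hao", "chuse"]),
    (["awful", "dull"], ["cha", "fawei"]),
]
model_src, model_tgt = co_train(labeled_src, labeled_tgt, unlabeled)
```

In this toy run, words that never appeared in the labeled data (`superb`, `dull`, `chuse`, `fawei`) acquire a polarity only through the pseudo-labels exchanged between the two views, which is the effect co-training relies on.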
Keywords/Search Tags:Sentiment Knowledge Sharing, Cross-sampling, Structural Sentiment Information, Knowledge Validation, Cross-Lingual Sentiment Relation, Intrinsic Sentiment-Expression Discrepancies