Research On Citation Sentiment Analysis Based On Semantics In Citation Context And Its Application

Posted on: 2021-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: D T Leng
Full Text: PDF
GTID: 2428330605964556
Subject: Management Science and Engineering
Abstract/Summary:
Citation analysis is an important research topic in scientific and technological evaluation and management. Compared with traditional citation analysis, which considers only citation frequency, citation content analysis can uncover valuable information in the citing text, such as semantic associations and sentiment orientation, and so supports a more comprehensive judgment of a citation's value. However, recognizing sentiment in citation context is considerably more complex than sentiment analysis of conventional text (such as micro-blog posts). Citation context has its own characteristics: for example, authors usually express their sentiment implicitly when citing, especially negative sentiment, which greatly increases the difficulty of sentiment recognition in citation context.

Meanwhile, as citation content analysis has developed, summarizing a single paper from the content of its citations has attracted growing attention from researchers. Such summaries can clarify the contribution of the cited paper to the scientific community from the perspective of how the paper is used. Unfortunately, current research on citation summarization has not yet considered the attitudes of the citers. Because citation sentiment is implicit, especially negative sentiment, existing citation-based summaries are dominated by neutral and positive evaluations, which can result in biased summaries.

To address these problems, this thesis first studies sentiment classification of citation context and improves its performance by capturing the linguistic patterns that authors use to express sentiment in citation context. Then, building on that classification, it generates a summary separately for each sentiment class of citation context and combines these summaries into a comprehensive citation summary that better illustrates the contribution and value of the cited document in the scientific community. The specific research contents are as follows:

(1) Sentiment Classification Based on Linguistic Patterns in Citation Context. This study explores the linguistic patterns of sentiment expression in citation context and, on this basis, recognizes the sentiment polarity of citation context. A conditional random field (CRF) model is introduced to label the logical relationships between syntactic structure and vocabulary in linguistic patterns. The role of linguistic patterns in classifying citation sentiment is discussed by analyzing how well the generated CRF templates classify subjective versus objective sentences and positive versus negative polarity in citation context. The experimental results show that the CRF model based on linguistic patterns outperforms the commonly used SVM model in both the subjective/objective and the polarity classification tasks, even though the SVM model incorporates contextual information from the citation context through word2vec embeddings. This indicates that linguistic patterns extracted from citation context reflect how authors organize their language when expressing sentiment, and that extracting these patterns helps improve the performance of sentiment classification of citation context.

(2) Scientific Paper Summarization Based on Citation Sentiment. Based on the sentiment classification of citation context, this study generates summaries for the collections of positive, negative and neutral citations of the target paper and combines the summaries for each sentiment into a comprehensive summary of the document, that is, a faceted summary, to clarify the contribution and value of the target paper in the scientific community. The pre-trained BERT model is used to process the citation fragments and generate text vectors that take into account their contextual semantic information. To describe the membership relationship between objects and clusters more reasonably, the vectorized citation fragments are clustered with the Fuzzy-C-Means algorithm, which is based on soft partitioning. Finally, LexRank and MMR are combined to select the summary content, so that the final summary takes both importance and diversity into account. The experimental results show that the summarization algorithm used in this study outperforms the baseline methods in three respects: text vector representation, clustering, and summary content selection, and thus technically improves summary generation. At the same time, in terms of content, the sentiment-based faceted citation summary captures the advantages and disadvantages of the target paper better than a general citation summary, which is more conducive to revealing the application value of the target paper.
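
The abstract states that LexRank and MMR are combined so that the selected summary content balances importance and diversity, but it gives no formulas or parameters. The following is a minimal sketch of a greedy maximal marginal relevance (MMR) selection step, not the thesis's implementation. The names `mmr_select`, `cosine`, and the trade-off weight `lam` are hypothetical, the importance scores are assumed to come from some ranking step such as LexRank, and the vectors are toy stand-ins for sentence embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_select(vectors, importance, k, lam=0.7):
    """Greedily pick up to k sentence indices.

    Each step picks the candidate maximizing
        lam * importance[i] - (1 - lam) * max similarity to already selected,
    so lam = 1.0 ranks by importance alone and smaller lam favors diversity.
    `importance` is an assumed input (for example, LexRank-style scores).
    """
    selected = []
    candidates = list(range(len(vectors)))
    while candidates and len(selected) < k:

        def score(i):
            # Redundancy is the closest match among sentences already chosen.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * importance[i] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    # Toy data: sentences 0 and 1 are near-duplicates, sentence 2 is different.
    toy_vectors = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]
    toy_importance = [0.9, 0.85, 0.5]
    print(mmr_select(toy_vectors, toy_importance, k=2, lam=0.5))
```

In this toy input the second-most important sentence is nearly identical to the first, so with `lam=0.5` the diversity penalty leads the sketch to choose the less important but distinct third sentence instead.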
Keywords/Search Tags:citation sentiment classification, citation summarization, linguistic patterns, Fuzzy-C-Means