Citation Context Diversity Ranking Based On Academic Text

Posted on: 2020-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Yang
Full Text: PDF
GTID: 2428330599451492
Subject: Information Science
Abstract/Summary:
With the explosive growth of scientific literature, the time cost of reading and citing papers is increasing. Researchers have proposed search result diversification algorithms and designed citation recommendation systems to reduce the set of papers that users need to read. However, little attention has been paid to improving efficiency when a user is reading or citing a specific paper. In addition, authors of papers cannot obtain much more from existing academic databases than the citation count and the h-index (or some other index) of a publication.

The author draws inspiration from shopping behavior on e-commerce websites: the citation contexts of a cited paper can be treated as comments that support the user's reading and citing decisions. In the existing CiteSeer database, citation contexts are displayed only when users click the "show context" button. They are ranked by citation number or citing time, along with their other citation information, which cannot satisfy the diverse needs of users because the citation contexts are redundant. As a result, the value of citation contexts as comments has not been tapped.

The author identified user needs for reading and citing a paper through interviews, then summarized three scenarios from the perspectives of readers and authors, and recommended 10 diversified citation contexts to users in each scenario. The data set in this thesis consists of cited papers and their citation contexts, filtered from the CiteSeer database under conditions such as the following: the citation number of the cited paper ranges from 50 to 100, and the publication venue of the cited paper belongs to the collection of international academic conferences and journals recommended by CCF. The CCF recommended categories (CCF-A, CCF-B, CCF-C and Other) were used to classify the publication venues of citations. The author also randomly selected one thousand citation contexts from the data set to label their citation emotions, dividing the citation contexts into Negative, Neutral and Positive.

For readers, the author first selected 10 citation contexts from the content perspective, and then re-ranked them by combining the classification of the citing publication with the citing time to complete the recommendation. For authors, the author first classified the citation contexts according to either the citation emotional tendency or the citing publication, and then recommended several citation contexts from each category, for a total of 10 citation contexts per user.

For the content diversification strategy, the author drew on strategies from search result diversification and selected the best of nine algorithms formed by combining three semantic distance algorithms (WordNet, ESA and word2vec) with three implicit diversification algorithms (MMR, Score Difference and ILP). After a user case study of these nine strategies, "word2vec + ILP" was chosen as the best strategy for content diversification.

In the evaluation, a questionnaire survey was used to judge the performance of each recommendation list according to the scores given to questions under four indicators: "readability", "diversity", "usefulness" and "display rationality". Compared with the citation context lists provided by CiteSeer, which are ranked by citation number, the diversified lists recommended by this study obtained better evaluation results.
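As a rough illustration of one of the implicit diversification algorithms named in the abstract, the following is a minimal sketch of greedy MMR (Maximal Marginal Relevance) re-ranking. It is not the thesis's implementation, and it is not the "word2vec + ILP" strategy the thesis finally selected. The function names, the toy vectors, the relevance scores and the trade-off parameter `lam` are all assumptions made for illustration; cosine similarity over small hand-written vectors stands in for a word2vec-based semantic distance.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_select(relevance, vectors, k=10, lam=0.5):
    """Greedy MMR: return indices of up to k items, in selection order.

    relevance: one score per citation context (hypothetical input).
    vectors:   one embedding per citation context (hypothetical input,
               e.g. averaged word vectors).
    lam:       trade-off; 1.0 = relevance only, 0.0 = diversity only.
    """
    selected = []
    remaining = list(range(len(vectors)))
    while remaining and len(selected) < k:
        def mmr_score(i):
            # Redundancy = similarity to the closest already-selected item.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed): contexts 0 and 1 are near-duplicates, context 2 differs.
vectors = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
relevance = [0.9, 0.85, 0.6]
print(mmr_select(relevance, vectors, k=2, lam=0.5))  # prints [0, 2]
```

With `lam=0.5`, the second pick skips the near-duplicate context 1 in favor of the less relevant but dissimilar context 2. With `lam=1.0` the same call falls back to a pure relevance ranking, which mirrors the redundancy problem the abstract attributes to ranking by a single criterion.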
Keywords/Search Tags: Academic text, semantic distance algorithms, diverse reranking algorithms, user case, diversity