Cross-Text Anaphora Resolution For Literature Authors

Posted on: 2020-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Yu
Full Text: PDF
GTID: 2428330575478892
Subject: Computer software and theory
Abstract/Summary:
In scientific research, searching by author is the main route to academic information retrieval. In document management systems of all kinds, however, the cross-text anaphora problem for literature authors is severe and makes retrieval difficult. Two questions arise: how to resolve the ambiguity between authors who share the same name in different documents (name disambiguation), and how to reconcile the inconsistent forms in which one author's name appears across documents (multi-name aggregation). Existing name disambiguation methods mainly classify authors by their co-authorship relations, homepages, and email addresses. Because email addresses and homepages are difficult to obtain, the core problem is how to classify authors accurately when this information is unavailable. In addition, author names take many forms, including forms not known in advance, so implementing multi-name aggregation when the name variants are unknown is a second problem for cross-text anaphora resolution. To address these problems, the main research contents of this thesis are as follows.

(1) A name disambiguation algorithm based on network representation learning is proposed. The method handles name disambiguation when the author's email, homepage, and similar information are unavailable and the number of distinct authors sharing the name is unknown. First, a paper-author network is constructed from multilevel collaborators and the paper-author relationship. Second, a feature-vector representation of each paper is obtained through the graph network. Finally, the relationship network between papers is used to disambiguate the author names.

(2) A multi-name aggregation algorithm based on feature similarity is proposed. Given only the author name, the method aggregates the name variants of Chinese literature authors by analyzing the feature similarity between papers. First, a set of name variants is constructed for the given author. Second, paper keywords are constructed and the similarity of different features between papers is analyzed to achieve multi-name aggregation.

(3) A multi-name aggregation algorithm based on supervised learning is proposed. The method divides multi-name aggregation into two sub-problems: paper title matching and paper author matching. First, for a given author name, papers are obtained from the Baidu Academic Scholars Channel. Second, the BLEU algorithm is used to analyze paper-to-paper and author-to-author similarity, find the name variants corresponding to each author, and achieve multi-name aggregation.

(4) The effectiveness of the proposed algorithms is verified. Their validity for name disambiguation and multi-name aggregation is verified on the benchmark data set published on AMiner and on a real data set collected manually. In addition, the proposed algorithm has been applied in the Academic Headline app (http://www.acheadline.com/).
Keywords/Search Tags:Information Retrieval, Cross-Text Anaphora Resolution, Name Disambiguation, Multi-name Aggregation