Research On The Recognition Method Of Quotations In Academic Papers

Posted on: 2019-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: J H Zhao
GTID: 2428330566460541
Subject: Information Science
Abstract/Summary:
As electronic databases have matured, the full text of formatted literature has become available, which makes full-text analysis of citation content possible. At the same time, rapid progress in computing techniques such as Natural Language Processing allows citations to be processed automatically. Literature research based on citation content has therefore become a new direction in citation analysis. However, because no complete citation database exists, this research is still at an early exploratory stage, and there are few related studies on Chinese literature. This paper therefore takes Chinese academic papers as its research object. It locates the source of each citation by calculating the similarity between the citation content and the full-text sentences of the cited document, and in this way identifies the sentences in the cited document that carry strong academic influence, namely the quoted sentences. The paper first introduces the theory and research status of citation content analysis, along with the basic concepts and calculation steps of text similarity. It then presents the basic ideas, advantages, and disadvantages of the two similarity algorithms adopted here, one based on the vector space model (VSM) and one based on latent semantic indexing (LSI). Next, it describes the design of the experimental scheme for quotation recognition and the modules that implement it. Finally, taking the research field of network literature as an example, 50 highly cited academic papers in this field and the documents citing them were selected as the experimental data set, and the experimental results were analyzed. The analysis found that: (1) text similarity algorithms based on the two models are feasible and effective for identifying quotations. Compared with the more complex LSI-based similarity algorithm, the VSM-based similarity algorithm is simpler and more stable. (2) There is a highly nonlinear positive correlation between the k value in the LSI algorithm and the number of feature terms used to construct the matrix, and when k is 300, the LSI algorithm achieves its best performance on 90% of the sample set. This study applies text similarity calculation to citation content analysis and provides a new perspective for identifying the quotations of academic papers.
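
The VSM-based matching step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the TF-IDF weighting, the whitespace tokenizer, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `find_quoted_sentence`) are all assumptions made for illustration. Chinese text would need a word segmenter in place of `str.split`.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: whitespace tokenization for illustration only.
    # Chinese text would require a word segmenter here.
    return sentence.lower().split()


def tfidf_vectors(token_lists):
    """Build sparse TF-IDF vectors (dicts) for a list of token lists."""
    n = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))  # count each term once per sentence
    # Smoothed IDF; this particular weighting is an assumption.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    return [
        {t: count * idf[t] for t, count in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_quoted_sentence(citation_text, cited_sentences):
    """Return (index, score) of the cited sentence most similar to the citation."""
    docs = [tokenize(s) for s in cited_sentences] + [tokenize(citation_text)]
    vectors = tfidf_vectors(docs)
    query = vectors[-1]
    scores = [cosine(query, v) for v in vectors[:-1]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical example data, not taken from the thesis.
cited_sentences = [
    "online literature grew rapidly after 2000",
    "readers interact with authors through comment sections",
    "this paper surveys payment models for serial fiction",
]
citation = "prior work notes that readers interact with authors in comment sections"
index, score = find_quoted_sentence(citation, cited_sentences)
print(index, round(score, 3))
```

In this sketch the highest-scoring sentence of the cited document is treated as the candidate quoted sentence. An LSI variant would instead project the same term vectors into a k-dimensional space with a truncated singular value decomposition before taking the cosine, which is where the k value discussed in the abstract comes in.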
Keywords/Search Tags:Quoted sentence recognition, Citation analysis, Text similarity, Academic papers