A Study On The Document Clustering With The Adjustment Of The Degree Of Relevance

Posted on: 2019-09-17    Degree: Master    Type: Thesis
Country: China    Candidate: F L Sun    Full Text: PDF
GTID: 2428330545959622    Subject: Information Science
Abstract/Summary:
Using the co-citation frequency between papers to represent the strength of their association is reasonable and effective, and it has become a widely applied method of literature analysis in information science. As theory and applications have developed, researchers have refined the indicators used to measure how closely papers are related, in response to problems met in practice. Several indicators can be combined for a more comprehensive measure, for example four counts: the number of co-citations, the number of shared references, the number of shared authors, and the number of shared keywords. However, when all four indicators take the same values, they are weak at revealing differences in the degree of relevance, so the Jaccard index is introduced to refine the indicators further and measure the closeness of papers more accurately. Specifically, building on the four counts, this study introduces a cited Jaccard index and a reference Jaccard index, giving six indicators in total. Two cases are considered: a directly computed composite index, and a sample-based degree of relevance derived from the six indicators of the sample papers. From the relevance-strength expression, or from the expected value over the probabilities, the degree of relevance between non-sample papers is obtained under both methods. The resulting paper structure is then analyzed with the standard co-citation analysis procedure and compared with the structure obtained before the adjustment.
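As a rough illustration of why a Jaccard index can refine a raw count, the sketch below compares two pairs of papers that share the same co-citation count. It is not the thesis's implementation: the paper and citation identifiers are invented, and it assumes that the "cited Jaccard" is the Jaccard index over the sets of works citing each paper (the "reference Jaccard" would apply the same function to the papers' reference lists).

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; defined here as 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical data: the set of works that cite each paper.
cited_by = {
    "P1": {"C1", "C2", "C3"},
    "P2": {"C2", "C3", "C4"},
    "P3": {"C2", "C3", "C5", "C6", "C7", "C8"},
}

# Both pairs are co-cited by exactly two works (C2 and C3).
cocitations_p1_p2 = len(cited_by["P1"] & cited_by["P2"])
cocitations_p1_p3 = len(cited_by["P1"] & cited_by["P3"])

# The cited Jaccard separates them: the shared citers make up a larger
# share of all citers for (P1, P2) than for (P1, P3).
cited_jaccard_p1_p2 = jaccard(cited_by["P1"], cited_by["P2"])
cited_jaccard_p1_p3 = jaccard(cited_by["P1"], cited_by["P3"])

print(cocitations_p1_p2, cocitations_p1_p3)
print(cited_jaccard_p1_p2, cited_jaccard_p1_p3)
```

Here the raw co-citation count cannot distinguish the two pairs, while the Jaccard values differ because they normalize the overlap by the total number of distinct citing works.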
Keywords/Search Tags:Co-citation analysis, Jaccard index, Principal component analysis, Multiple regression analysis