Design And Implementation Of Rainbow Retrieval System By Modeling And Exploiting Bibliographic Network

Posted on: 2018-11-07  Degree: Master  Type: Thesis
Country: China  Candidate: Y Zhang
GTID: 2348330512990975  Subject: Communication and Information System
Abstract/Summary:
With the rapid development of mobile Internet and cloud computing technology, the amount of data generated, acquired, processed and stored in all areas of society is growing exponentially. Big data, as a symbol of the times, is influencing production and daily life through its variety, its many forms and its interconnection. In academia, the cumulative number of publications has reached the hundreds of millions, and this volume of publication data poses great challenges for traditional search technology. Traditional methods rank publications mainly by a single piece of publication information, such as the relevance between the search terms and the content or the citation count of the publication, without considering the relationships between nodes in the academic network or the attributes of those nodes. The results therefore suffer from poor relevance, deviation from the subject and low retrieval quality. In addition, traditional academic retrieval systems mainly provide document retrieval services, although recommending authoritative experts in a field plays an important role in guiding researchers' work. Given the massive amount of academic data, how to mine the deeper semantic information in the link structure and build an expert retrieval system is also an important research topic. Against the background of big data, the development of data mining technology and distributed computing provides effective methods for solving these problems.

In this thesis, we optimize the retrieval methods and the application design of the retrieval system by constructing academic information networks for two scenarios: a publication retrieval system and an expert retrieval system. First, in the publication retrieval system, we use the PageRank algorithm, which is based on link analysis, to rank the importance of the nodes. We address the shortcomings of the PageRank algorithm in two respects: (1) We use the different attributes of nodes in the academic information network to calculate the prestige of the publication nodes. Based on the prestige of publications, we optimize the weight distribution strategy in the PageRank algorithm and propose the SQT-Rank algorithm to improve on the ranking performance of the traditional algorithm. (2) Considering the huge amount of data involved, we implement the SQT-Rank algorithm with the MapReduce programming model for parallel processing to improve the computational performance of the algorithm.

Furthermore, a heterogeneous information network contains richer link structure semantic information than a homogeneous information network. In the expert retrieval system, in order to carry out deeper data mining and analysis, we first construct the academic heterogeneous information network and extract the six relationship matrices between publications, experts and journals. Then, based on a unified framework of publications, experts and journals, we propose the Mutual Reinforcement Expert Ranking (MR-Rank) algorithm to obtain fairer and more reasonable expert ranking results.

Finally, the architecture and functions of the Rainbow Retrieval System, which is based on the academic network, are implemented on the basis of the above theoretical research. The system architecture includes data acquisition, data storage, data indexing, data analysis and visualization of results. The data analysis and processing stage performs extraction, cleaning and conversion of the academic data, and carries out publication and expert importance analysis and other functions. The Rainbow Retrieval System presents the ranking results to users in a specified way.

In summary, this thesis aims to solve the problem of accurate publication retrieval and expert recommendation in specific fields against the background of big data. By constructing homogeneous and heterogeneous academic networks, the SQT-Rank publication ranking algorithm and the MR-Rank expert ranking algorithm are used to mine the importance of nodes in the network. We then apply the Rainbow Retrieval System to recommend high-quality publications and experts and so improve the user's search experience.
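
The abstract does not give the SQT-Rank formulas, so what follows is only a minimal sketch of the general idea it describes: standard PageRank on a citation graph, with an optional and purely hypothetical `prestige` weight that changes how a paper's score is split among the papers it cites. The function name, the parameters and the weighting rule are assumptions made for illustration and are not the thesis's actual method. With `prestige=None` the code reduces to ordinary PageRank with uniform weight distribution.

```python
from typing import Dict, List, Optional


def pagerank(
    citations: Dict[str, List[str]],
    prestige: Optional[Dict[str, float]] = None,
    damping: float = 0.85,
    iterations: int = 100,
    tol: float = 1e-10,
) -> Dict[str, float]:
    """Power-iteration PageRank over a citation graph.

    citations[p] lists the publications that p cites.
    prestige is a hypothetical per-publication weight (an assumption,
    not taken from the thesis): a citing paper passes more of its score
    to cited papers with higher prestige. None means uniform splitting.
    """
    nodes = sorted(set(citations) | {q for outs in citations.values() for q in outs})
    n = len(nodes)
    if n == 0:
        return {}
    rank = {p: 1.0 / n for p in nodes}

    for _ in range(iterations):
        # Every node starts with the random-jump share.
        new_rank = {p: (1.0 - damping) / n for p in nodes}
        dangling = 0.0
        for p in nodes:
            outs = citations.get(p, [])
            if not outs:
                # Papers that cite nothing spread their score evenly later.
                dangling += rank[p]
                continue
            weights = [prestige.get(q, 1.0) if prestige else 1.0 for q in outs]
            total = sum(weights)
            if total <= 0:
                # Fall back to uniform splitting if all weights are zero.
                weights = [1.0] * len(outs)
                total = float(len(outs))
            for q, w in zip(outs, weights):
                new_rank[q] += damping * rank[p] * w / total
        share = damping * dangling / n
        for p in nodes:
            new_rank[p] += share

        delta = sum(abs(new_rank[p] - rank[p]) for p in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    graph = {"A": ["C"], "B": ["C"], "C": []}
    for paper, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(paper, round(score, 4))
```

Each iteration reads only the previous scores and each paper's outgoing links, so the per-paper contributions could in principle be emitted in a map step and summed per cited paper in a reduce step. This sketch keeps everything in a single process for clarity.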
Keywords/Search Tags:Academic Network, Publication Retrieval, Expert Recommendation, Heterogeneous Information Network, Data Mining