The Research And Implementation Of Information Cartography In Scientific Research Domain

Posted on: 2019-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: T M Ma
Full Text: PDF
GTID: 2348330545955632
Subject: Computer Science and Technology
Abstract/Summary:
The main task of information retrieval in the scientific research field is to select and rank information sources and to locate related information accurately within the ocean of scientific literature, according to a user's description of a scientific paper. This thesis departs from traditional information retrieval systems and proposes a novel information cartography suited to scientific research scenarios, which displays the retrieved information as an information map. Such a map not only conveys the results of traditional information retrieval but also emphasizes the complex flow of information among the retrieved results. The design of the information cartography consists of three components: representation learning for scientific literature, information network generation, and information network optimization. To integrate multiple rich information sources and fuse heterogeneous information into a single representation vector space, this thesis adopts a Skip-gram-based paragraph vector representation algorithm and DeepWalk, a network representation learning algorithm. Combining these two algorithms, it proposes two mixed-embedding-based document representation learning methods: the Mixed Vector Representation with Semantic Link (MVRSL) model and the Mixed Vector Representation with Pretrained Embedding (MVRPE) model. Experiments demonstrate the validity and efficiency of MVRSL and MVRPE: both achieve 5% to 10% higher accuracy than the algorithms they were compared against on classification and link prediction tasks. To generate the information network starting from the target document, this thesis describes a scheme that iteratively includes the most strongly related documents and records the links among them, using the learned fused document representations and a link predictor. Subsequently, to improve the readability of the generated network, this thesis proposes the Degree-Centrality-based Link Importance Metric (DLIM). With this metric, the core of the generated network can be extracted while the loss of network connectivity is kept within 5%. Finally, this thesis develops a pilot information map retrieval system that deploys the proposed information cartography and converts the obtained information network into a charted information map.
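The iterative network generation step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the parameters `top_k`, `max_nodes` and `threshold`, and the toy vectors are all hypothetical, and plain cosine similarity over document vectors stands in for the learned link predictor, whose details the abstract does not give.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_information_network(target, embeddings, top_k=2, max_nodes=5, threshold=0.5):
    """Grow a document network outward from `target` (assumed scheme).

    embeddings: dict mapping document id -> representation vector.
    Returns (nodes, edges); edges are undirected, stored as sorted id pairs.
    """
    nodes = [target]
    edges = set()
    frontier = [target]  # documents whose neighbours are still to be explored
    while frontier and len(nodes) < max_nodes:
        current = frontier.pop(0)
        # Score every other document against the current one (stand-in predictor).
        scored = sorted(
            ((cosine(embeddings[current], vec), doc)
             for doc, vec in embeddings.items() if doc != current),
            reverse=True,
        )
        for score, doc in scored[:top_k]:
            if score < threshold:
                break  # remaining candidates are weaker still
            if doc not in nodes:
                if len(nodes) >= max_nodes:
                    continue  # network is full; skip new documents
                nodes.append(doc)
                frontier.append(doc)
            edges.add(tuple(sorted((current, doc))))  # record the link
    return nodes, edges


# Toy vectors, purely illustrative.
demo = {
    "target": [1.0, 0.0],
    "a": [0.9, 0.1],
    "b": [0.8, 0.2],
    "c": [0.0, 1.0],
}
nodes, edges = build_information_network("target", demo)
print(nodes)
print(sorted(edges))
```

The breadth-first frontier keeps the documents closest to the target at the centre of the network, and the size cap and score threshold bound how far the expansion runs. A pruning step such as DLIM would then operate on the returned edge set.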
Keywords/Search Tags: information retrieval, natural language processing, neural network, representation learning, link prediction