
Research On Multi-dimensional Aggregation Of Sci-tech Literature Based On Content Deep Discovery

Posted on: 2021-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: J L Song
Full Text: PDF
GTID: 2428330602971281
Subject: Computer technology
Abstract/Summary:
With the explosive growth of scientific and technological literature, accurately locating the knowledge one needs among vast search resources has become particularly important. Traditional approaches to organizing the knowledge in literature mainly study external features such as title, subject, author, keywords, and references, or reveal the knowledge objects and semantic relationships within a single document. They lack in-depth mining and organization of content across documents on the same subject. As a result, most of the knowledge contained in scientific and technological literature remains in a "free state", and the lack of collaboration between pieces of knowledge makes it difficult to generate knowledge clusters and knowledge chains across documents.

To address these problems, and drawing on existing research results, this thesis proposes a content-map-based method for the deep disclosure of scientific and technological literature content. The method takes scientific and technological literature as its main research object, applies text mining to the content of text fragments to extract the core knowledge objects and the semantic relationships between them, constructs content maps for multiple scientific documents, and uses these content maps to achieve fine-grained description and multi-dimensional aggregation of document content knowledge. The study addresses three key issues: (1) how to extract from scientific literature the core knowledge objects and relationships that meet the needs of the research; (2) how to use a graph structure to express the extracted core knowledge objects and relationships; and (3) how to use content maps to achieve deep disclosure and multi-dimensional aggregation of scientific and technological literature content knowledge.

In response to these three issues, the thesis carries out the following work. (1) After the original text data set is enriched, a domain-dictionary-based method for extracting literature knowledge objects and their relationships is designed. (2) Taking into account the strength of the relationships between the semantic objects in the semantic collection and the semantic sub-collections, a formula for calculating the importance of knowledge objects is proposed; based on this formula, the knowledge objects are ranked and the semantic sub-collections are extracted to construct a content map of the scientific and technological literature. (3) A content-map-based deep disclosure method of "discover downward and aggregate upward" is proposed. With the help of the multiple content maps already constructed, it generates cross-document knowledge clusters and knowledge chains in plane space and three-dimensional space, realizes literature knowledge inference, and achieves deep aggregation of the knowledge content of scientific and technological literature along the dimensions of knowledge objects, semantic relations, knowledge units, and statistics.

The thesis selects 172 articles related to the theme "Ebola" from the PubMed database to construct a text set, takes their titles and abstracts as the original data set, and conducts effectiveness experiments on the proposed methods. The experimental results are compared with those of the widely used LDA method and the Louvain algorithm, and the aggregation results are compared retrospectively with the original literature content. The results show that the proposed method extends knowledge organization from the external characteristics of the literature to its internal characteristics, reveals the content knowledge of scientific and technological literature in depth, and achieves fine-grained description and multi-dimensional aggregation of that knowledge.
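
The pipeline described above (dictionary-based extraction, a per-document content map, an importance score, and cross-document aggregation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract gives neither the thesis's domain dictionary nor its importance formula, so `DOMAIN_DICT` is a hypothetical four-term dictionary, sentence-level co-occurrence stands in for semantic relation extraction, and a weighted-degree score stands in for the proposed importance formula.

```python
from collections import Counter
from itertools import combinations

# Hypothetical domain dictionary (not from the thesis): term -> object type.
DOMAIN_DICT = {
    "ebola virus": "pathogen",
    "fever": "symptom",
    "hemorrhage": "symptom",
    "vaccine": "intervention",
}


def extract_objects(sentence):
    """Return dictionary terms found in a sentence (plain substring match)."""
    text = sentence.lower()
    return sorted(term for term in DOMAIN_DICT if term in text)


def build_content_map(document):
    """One document's content map: {(term_a, term_b): co-occurrence count}.

    Assumption: two objects are related if they share a sentence.
    """
    edges = Counter()
    for sentence in document.split("."):
        for a, b in combinations(extract_objects(sentence), 2):
            edges[(a, b)] += 1
    return edges


def importance(edges):
    """Stand-in importance score: sum of edge weights touching each object."""
    score = Counter()
    for (a, b), weight in edges.items():
        score[a] += weight
        score[b] += weight
    return score


def aggregate(maps):
    """Merge per-document maps into one cross-document map (sum the weights)."""
    merged = Counter()
    for content_map in maps:
        merged.update(content_map)
    return merged


docs = [
    "Ebola virus causes fever. Ebola virus can cause hemorrhage and fever.",
    "A vaccine against Ebola virus was tested. Fever was a side effect of the vaccine.",
]
maps = [build_content_map(doc) for doc in docs]
merged = aggregate(maps)
ranking = importance(merged).most_common()
print(ranking)
```

In this toy setup, the link between "vaccine" and "fever" appears only in the second document and the link between "hemorrhage" and "fever" only in the first; merging the maps places both around the shared "fever" node, which is a simple instance of the cross-document knowledge chains the abstract refers to.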
Keywords/Search Tags: Scientific Literature, Content Map, Deep Revelation, Semantic Analysis, Multidimensional Aggregation