Research On Knowledge Element Extraction Based On Massive Academic Resources

Posted on: 2015-08-27  Degree: Master  Type: Thesis
Country: China  Candidate: Y Y Wang  Full Text: PDF
GTID: 2298330422993103  Subject: Computer technology
Abstract/Summary:
In the vast body of academic resources, if the unit of knowledge control is deepened from the document to the knowledge element, so that each document can be decomposed into independent knowledge elements, the storage and retrieval of knowledge become easier and the process of knowledge creation is shortened. In addition, the links between knowledge elements across the literature can reveal implicit knowledge among related fields, which helps uncover new knowledge units, supports knowledge appreciation and transformation, and accelerates knowledge innovation. The study of knowledge elements in academic resources is therefore crucial to how people use existing knowledge and create new knowledge.

The extraction of knowledge elements is basic work in this research field. There is currently no unified theoretical understanding of the knowledge element, and the definition of the knowledge element model differs across research fields. Although current studies have confirmed the existence of implicit associations between knowledge elements across the literature, there is still no effective method for extracting knowledge elements from academic resources. Manual extraction is laborious and difficult to carry out. Some scholars have tried to extract knowledge elements automatically by computer, but their understandings of the knowledge element differ and the extraction results are not ideal, so these methods are not suitable for academic resources. This paper focuses on the automatic extraction of knowledge elements from academic literature resources.

Firstly, drawing on the characteristics of academic resources in digital libraries, this paper proposes a seven-tuple structural model of the knowledge element that reflects those characteristics. For knowledge element extraction to be comprehensive, topic partition of the academic literature resources is indispensable.
This part introduces the normalized cut criterion used for topic partition. When normalized cut is applied, the weight matrix must accurately reflect the degree of similarity between the nodes of the text relation map, because it directly affects the segmentation result. This paper computes the similarity between nodes from a semantic point of view to construct the weight matrix, proposes a topic partition algorithm based on normalized cut, and shows that the method is effective for topic partition.

Secondly, this paper proposes a method for extracting term definition sentences from academic literature resources in the same field. The method first processes the sentences of the academic literature with a hard-match method to generate a library of candidate term definitions. It then combines a degree-of-definition-membership algorithm with a sentence-importance ranking algorithm to improve the accuracy of term definition extraction. The experimental results show that the method performs well.

Then, the topic partition algorithm and the term definition extraction algorithm are integrated into an extraction system. To cope with the large scale of academic literature resources, this part first constructs the text relation map using a latent semantic analysis model, in preparation for the topic partition module. It then builds the knowledge element extraction system by adding the term definition extraction module, which completes the extraction of the content description, one attribute of the knowledge element. Meanwhile, this paper outlines ideas for extracting the other knowledge element attributes. Finally, it summarizes the whole research work and discusses prospects for future work.
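
The normalized cut step mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the semantic similarity measure or the exact partition procedure, so the code below is a stand-in built on assumptions. It uses cosine similarity between toy feature vectors as the weight matrix, and the standard spectral relaxation of normalized cut (thresholding the second-smallest generalized eigenvector at zero) to split text units into two topics. The function names and the toy data are hypothetical.

```python
import numpy as np


def cosine_weight_matrix(X):
    """Build a symmetric, non-negative weight matrix from row vectors.

    Assumption: cosine similarity stands in for the semantic similarity
    measure, which the abstract does not specify.
    """
    X = np.asarray(X, dtype=float)
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    W = np.clip(unit @ unit.T, 0.0, None)
    np.fill_diagonal(W, 0.0)  # no self-loops in the text relation map
    return W


def normalized_cut_bipartition(W):
    """Split graph nodes into two groups via the spectral relaxation of normalized cut."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)  # node degrees (assumed strictly positive)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Map the second eigenvector back to the generalized problem (D - W) y = lambda D y
    y = d_inv_sqrt * eigvecs[:, 1]
    return y > 0  # boolean mask: True for one group, False for the other


def ncut_value(W, mask):
    """Normalized cut criterion: cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V)."""
    cut = W[mask][:, ~mask].sum()
    return cut / W[mask].sum() + cut / W[~mask].sum()


# Toy data: six text units, the first three on one topic, the last three on another.
X = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
     [0.1, 1.0], [0.2, 0.9], [0.0, 1.0]]
W = cosine_weight_matrix(X)
labels = normalized_cut_bipartition(W)
print(labels, ncut_value(W, labels))
```

Which group receives `True` depends on the sign of the eigenvector returned by the solver, so only the grouping itself is meaningful. Partitioning into more than two topics could be done by applying the bipartition recursively, which is one common choice and not necessarily the one used in the thesis.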
Keywords/Search Tags: Academic Resources, Topic Partition, Term Definition, Knowledge Element Extraction