Research On Deep Processing And Topic Evolution Of English Scientific And Technical Literature For Selective Dissemination Of Information

Posted on: 2017-04-04; Degree: Master; Type: Thesis
Country: China; Candidate: Y Wu; Full Text: PDF
GTID: 2308330488961220; Subject: Information Science
Abstract/Summary:
Scientific and technical information and scientific knowledge are key factors in scientific and technological progress and innovation. In the digital network environment, scientific and technical literature resources are enormous in volume, diverse, and growing rapidly. With the rapid development of science and technology, the urgent demand for recent scientific and technical literature has led to a rapid increase in such literature, especially in English. The aim of this paper is therefore to obtain useful knowledge from the massive body of English scientific literature in order to meet the selective dissemination of information needs of different scientific researchers.

This paper studies English scientific and technical literature for the needs of selective dissemination of information in the digital network environment. It draws on theories and methods concerning the evolution and trends of selective dissemination of information, the storage and processing of literature resources, topic analysis and evolution, and knowledge storage and management, and it combines technical methods such as key phrase extraction, topic extraction and topic evolution. On this basis, the paper discusses the problem of deep processing and topic evolution of English scientific and technical literature for selective dissemination of information and designs a prototype system that can improve the efficiency of processing, analyzing and using knowledge of a thematic area in science and technology information services.

The main work and research of this paper includes the following three aspects:

(1) This paper integrated the process framework for the deep processing of English scientific and technical literature.
It introduced the architecture from three aspects: acquisition and import of resources, processing of resources, and knowledge services based on those resources. Guided by applicability and operability, key phrase extraction in the resource processing stage was improved and applied across the lifecycle and process of deep processing of English scientific and technical literature. Based on the N-Gram statistical model, the Snowball classifier and the features of English scientific and technical literature, and combining the features of TF-IDF, word frequency, number of words, number of capital letters in the words and position of words, an improved fine-grained, multi-level key phrase extraction method for English scientific and technical literature was introduced, designed to improve the correct rate and recall rate of key phrase extraction. An experimental corpus of 233 English journals from the artillery field was collected. Experiments on this corpus show that the recall rate of this paper's algorithm reaches 61.10%, nearly three times that of the KEA algorithm, and its F1 score also exceeds that of the traditional KEA algorithm.

(2) A multi-dimensional, visual topic evolution analysis system for English scientific and technical literature, based on key phrase extraction and a topic model, was proposed, covering the architecture, the measurement of topic intensity and the method for determining topic evolution. The time axis, key phrase frequencies and other external characteristics were selected as parameters to analyze topics from different angles and dimensions on the basis of statistical characteristics. Meanwhile, a topic model of English scientific and technical literature was selected to construct the three topological structures of literature, feature words and topics, so that topics could be analyzed across two carriers: science and technology journals and news.
English science and technology journals and news from the artillery field were selected to carry out topic analysis and to identify patterns of topic evolution based on topic content and time series, which reveals differences in timing and writing style between English scientific and technical literature on different carriers.

(3) A prototype system for deep processing and topic evolution of English scientific and technical literature for selective dissemination of information was designed and developed in C# on the .NET platform. A new mode of selective dissemination of information in the network environment was proposed. The system platform offers flexible operation and expandable repositories, and it can provide personalized, deep-level, multi-dimensional and fine-grained deep processing and topic evolution services for English scientific and technical literature to science and technology researchers at library and information institutions.
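
The candidate features named in item (1) (TF-IDF, word frequency, number of words, number of capital letters and position) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not give its formulas, so the smoothed IDF, the normalization by token count, the tokenizer and the function names `candidate_ngrams` and `phrase_features` are all assumptions made for illustration, and the Snowball step and the classifier itself are omitted. Python is used here for brevity, although the prototype described above was built in C#.

```python
import math
import re
from collections import Counter


def candidate_ngrams(text, max_n=3):
    """Return ([(phrase, first_token_index, capital_letters)], token_count).

    Candidates are all word n-grams with 1..max_n words (illustrative choice).
    """
    tokens = re.findall(r"[A-Za-z][A-Za-z\-]*", text)
    candidates = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            words = tokens[i:i + n]
            phrase = " ".join(w.lower() for w in words)
            capitals = sum(ch.isupper() for w in words for ch in w)
            candidates.append((phrase, i, capitals))
    return candidates, len(tokens)


def phrase_features(doc, corpus, max_n=3):
    """Compute one feature dict per candidate phrase of `doc`.

    `corpus` is a list of document strings used only for document frequency.
    The IDF below is a common smoothed variant, assumed for this sketch.
    """
    doc_cands, n_tokens = candidate_ngrams(doc, max_n)
    term_freq = Counter(phrase for phrase, _, _ in doc_cands)
    corpus_phrases = [
        {phrase for phrase, _, _ in candidate_ngrams(d, max_n)[0]}
        for d in corpus
    ]
    features = {}
    for phrase, position, capitals in doc_cands:
        if phrase in features:
            continue  # position and capitals are taken from the first occurrence
        doc_freq = sum(phrase in s for s in corpus_phrases)
        idf = math.log((1 + len(corpus)) / (1 + doc_freq)) + 1.0
        features[phrase] = {
            "frequency": term_freq[phrase],
            "tfidf": term_freq[phrase] / n_tokens * idf,
            "num_words": len(phrase.split()),
            "capitals": capitals,
            "first_position": position / n_tokens,
        }
    return features
```

In a KEA-style pipeline, feature vectors like these would be passed to a trained classifier that scores each candidate, and the top-scoring candidates would be kept as key phrases; that step is left out here because the abstract does not describe it in detail.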
Keywords/Search Tags: Selective dissemination of information, English scientific and technical literature, Key phrase extraction, Topic model, Topic evolution