The Theme Discovery And Evolution Of Domestic Digital Library Research Based On LDA Model

Posted on: 2018-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: J L Wu
GTID: 2348330518469901
Subject: Library and file management
Abstract/Summary:
Because publication cycles have shortened greatly and science and technology are developing rapidly, the literature has become both voluminous and thematically diverse. Quickly identifying the research hotspots of a field and grasping its research trends is therefore a serious problem for researchers and research departments, and the field of digital libraries is no exception. As basic infrastructure of the knowledge economy, the digital library is essential to the operation of the national economy. It is also the community's public information storage center and information transfer station, providing comprehensive information services to the public. Since the digital library was proposed and put into practice in the 1990s, it has gone through nearly 20 years of theoretical research and practical development and has now entered a relatively mature stage. Analyzing the themes of its academic output helps reveal the thread of academic development, the academic hotspots and the development trends of the field, which helps scholars find new research entry points and promotes the vitality of the digital library.

LDA (Latent Dirichlet Allocation) is a classical and effective probabilistic generative model with a three-layer Bayesian structure of document, theme (topic) and word. It can mine the latent semantic information in text and has been widely used in text classification, information retrieval, sentiment analysis, topic mining and other fields, and it plays an important role in research on theme discovery and evolution in scientific literature. The title, abstract and keywords are an important part of a scientific paper: they usually represent the author's condensation and summary of the main points of the article and can play a great role in theme analysis. Unfortunately, existing theme studies of digital library research have failed to value and use these elements.

This paper uses the LDA model to mine the content of domestic digital library research papers published over the ten years 2007-2016, analyzes the theme structure, reveals the hot themes and the theme evolution process, and finally discusses the evolution results against the actual background. It hopes to provide reference and support for research and work related to digital libraries and thus promote their healthy development. The details are as follows:

(1) This paper summarizes the existing methods of theme identification and evolution analysis and analyzes them in detail in terms of basic principles, research status, advantages and disadvantages. It studies the complete process of modeling with LDA, including the Gibbs sampling parameter estimation method, the method for determining the optimal number of themes, theme filtering based on information entropy, the hot theme selection method, the posterior discrete theme evolution method and the theme evolution measurement method. It refines the key issues and proposes a theme trend recognition method based on theme intensity clustering.

(2) Taking journal articles on domestic digital libraries from the ten years 2007-2016 and combining them with the time factor, we use LDA to carry out theme evolution analysis and identify the theme structure of digital library research: user research, construction countermeasures, evaluation research, information services, education and training, knowledge management, resource organization, resource sharing, copyright research, mobile library, resource storage and security, field research review, application technology research, and digital library based on cloud computing. These can provide reference solutions for digital library researchers, managers and builders.

(3) Using the same data, we find that information services and development strategy are stable research themes that attract great attention from researchers, and that the organization and construction of resources, application technology and copyright issues are stable research themes in the field of digital libraries. Using the discrete theme evolution method, we carry out the theme evolution analysis and draw the intensity evolution curves of the 14 themes over the 10 years. Using the theme trend analysis method based on intensity clustering, the 14 themes are divided into ascending, descending and stationary types. The paper judges that user research and mobile library are emerging research themes of digital libraries and that their theme heat will rise in the future.
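
The intensity-curve and trend-labeling steps described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's method: it assumes a document-theme matrix has already been produced by some LDA implementation, it defines a theme's intensity in a year as the mean share of that theme over the year's documents (a common choice, but an assumption here), and it replaces the intensity clustering of the thesis with a plain least-squares slope threshold. The function names, the threshold value and the toy data are all hypothetical.

```python
from collections import defaultdict


def yearly_theme_intensity(doc_theme, years):
    """Average each theme's share over the documents published in each year.

    doc_theme: list of per-document theme distributions (assumed LDA output).
    years:     publication year of each document, in the same order.
    """
    sums = {}
    counts = defaultdict(int)
    for theta, year in zip(doc_theme, years):
        if year not in sums:
            sums[year] = [0.0] * len(theta)
        for k, share in enumerate(theta):
            sums[year][k] += share
        counts[year] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sorted(sums)}


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs (0.0 if xs do not vary)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den if den else 0.0


def classify_trends(intensity, threshold=0.01):
    """Label each theme by the slope of its yearly intensity curve.

    Assumption: a slope threshold stands in for the intensity clustering
    used in the thesis; the threshold value is arbitrary.
    """
    years = sorted(intensity)
    n_themes = len(intensity[years[0]])
    labels = []
    for k in range(n_themes):
        s = ols_slope(years, [intensity[y][k] for y in years])
        if s > threshold:
            labels.append("ascending")
        elif s < -threshold:
            labels.append("descending")
        else:
            labels.append("stationary")
    return labels


# Toy data (hypothetical): six documents, three themes, three years.
doc_theme = [
    [0.7, 0.2, 0.1],
    [0.6, 0.2, 0.2],
    [0.4, 0.2, 0.4],
    [0.3, 0.2, 0.5],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
]
years = [2007, 2007, 2008, 2008, 2009, 2009]

intensity = yearly_theme_intensity(doc_theme, years)
labels = classify_trends(intensity)
```

In this toy run the first theme loses share each year, the second stays flat and the third gains share, so the three labels come out as descending, stationary and ascending. A real analysis would substitute the 14 themes and ten yearly slices from the fitted model, and could cluster the curves instead of thresholding their slopes.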
Keywords/Search Tags: Digital Library, Theme Discovery, Theme Evolution, LDA