In the era of big data, the rapid and accurate identification and prediction of hot research topics in scientific and technological literature is an important means of helping researchers understand the state of development of a specific research field. Research hotspots and the evolution of research topics in a scientific field can be discovered from the mass of academic literature. This not only saves researchers considerable time and money, but also helps science and technology innovation entities and science and technology policymakers assess the current state and future trends of the relevant research fields from a macro, overall viewpoint. It is therefore of great practical significance to identify hot topics in science and technology and to predict the trend of their popularity. Against this background, this thesis focuses on topic identification and popularity prediction for scientific literature. It proposes an online topic model for scientific literature and a method for predicting topic evolution, and on this basis develops an intelligent analysis and prediction system for scientific and technological information. The main work of this thesis is as follows:

(1) An improved OLDA online topic model (IOLDA) is constructed. In the traditional OLDA model, the weights of the content evolution matrix are fixed, which leads to problems such as the mixing of new and old topics. A dynamic weight calculation method is proposed and combined with the topic similarity matrix to construct a pseudo-variable-length topic content evolution matrix, which improves the modeling quality of the topic model. Experiments with the proposed IOLDA online topic model are carried out on collected data sets of scientific and technological literature. The experimental results show that the IOLDA model outperforms the comparison models on metrics such as perplexity.

(2) A combined ESA prediction model for topic popularity is proposed. To address existing problems such as overly simple topic popularity features, a popularity index system for science and technology topics is constructed from features such as topic intensity, improving the robustness of topic popularity prediction. In addition, taking into account the periodicity and trend characteristics of time series, the ESA combined prediction model is proposed, and prediction experiments are carried out on the topic popularity time series automatically extracted by the IOLDA model. The experimental results show that the ESA combined prediction model outperforms comparison models such as LSTM in terms of MAE and RMSE.

(3) An intelligent system for the analysis and prediction of scientific and technological information from scientific and technological literature is developed. Building on an intelligent analysis project for scientific and technological data, and centered on topic discovery and the prediction of evolutionary trends, the overall architecture and key functional modules of the Intelligent Science and Technical Knowledge Analysis and Prediction System are designed and implemented. Finally, integration testing is performed on each functional module. The test results show that the system basically meets the requirements for the analysis and prediction of scientific and technological information.