Hot Topic Prediction Based On Time Series

Posted on: 2020-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: X P Nie
GTID: 2428330575451908
Subject: Agricultural information management
Abstract/Summary:
In a given research field, identifying hot research topics from the scientific and technological literature is important for understanding how the field is developing and for informing science and technology policy. In this paper, the Relim algorithm is used to identify topics in the field of "animal genetics and breeding", and four machine learning algorithms are used to forecast research hot topics. The experiments show that an integrated forecast model performs better for topic forecasting and that forecasting two steps ahead gives the best results. Hot topic prediction helps in understanding the status of hot topics over the coming period. The main research work of this paper is as follows:

(1) Research hot topic identification based on the Relim algorithm. A comparative study of several topic identification algorithms shows that the Relim algorithm is better suited to topic prediction, so it is adopted as the core algorithm for automatically identifying research hot topics from the scientific and technological literature in the field of animal genetics and breeding. A total of 283 hot topics, such as "animal", "association", "behavior", "animal_association_behavior" and "breed", are mined from the experimental data. To eliminate redundancy, these are reduced to 250 hot topics, such as "animal_association_behavior" and "breed". The frequencies of the reduced hot topics form time series data covering 2000 to 2017.

(2) Prediction of hot topic evolution trends based on machine learning algorithms. Linear regression, support vector machine, radial basis function regression and radial basis function neural network are used to predict the trend of the hot topic "breed". The comparison shows that, for the same time series, the mean square error, root mean square error and mean absolute error of the four algorithms differ greatly, owing to the diversity or independence of the prediction algorithms. The forecasts of the individual models are therefore integrated to predict the trend of hot topic evolution. Combining models that perform poorly with models that perform well yields an integrated prediction model with more stable performance. A prediction experiment on the topic "weight body", forecasting up to five steps ahead, shows that the two-step-ahead scheme is optimal. Finally, the integrated prediction model is used to predict the topics "ability", "acid" and "activation" two steps ahead. After 2017, the frequency of the topic "ability" shows a decreasing trend. The frequency of the topic "acid" decreases in 2018 but rises in 2019. The frequency of the topic "activation" remains steady after 2017.

The experimental results show that the method used in this paper can accurately predict the trend of hot topic evolution in the field of animal genetics and breeding, especially for the next two years. The method is also suitable for predicting hot topics in other disciplines or fields from their scientific and technological literature, helping users quickly discover the future status of hot topics.
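The idea of integrating several single-model forecasts and comparing them by mean square error, root mean square error and mean absolute error can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the forecasts are combined, so a simple average is assumed here; the two forecasters (a least-squares linear trend and a last-value baseline) are simplified stand-ins rather than the four models used in the thesis; and the topic frequencies are invented for the example.

```python
from math import sqrt

def mse(actual, predicted):
    """Mean square error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return sqrt(mse(actual, predicted))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def linear_trend_forecast(series, steps):
    """Fit y = intercept + slope * t by least squares and extrapolate `steps` points."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n + k) for k in range(steps)]

def last_value_forecast(series, steps):
    """Baseline stand-in: repeat the last observed value."""
    return [series[-1]] * steps

def integrated_forecast(series, steps, models):
    """Assumed integration rule: average the forecasts of all models per step."""
    forecasts = [model(series, steps) for model in models]
    return [sum(values) / len(values) for values in zip(*forecasts)]

# Hypothetical yearly frequencies of one topic (not data from the thesis).
history = [10, 12, 14, 16, 18, 20]
held_out = [22, 24]  # hypothetical values for the next two years

models = [linear_trend_forecast, last_value_forecast]
two_step = integrated_forecast(history, 2, models)

for name, forecast in [("linear", linear_trend_forecast(history, 2)),
                       ("last value", last_value_forecast(history, 2)),
                       ("integrated", two_step)]:
    print(name, mse(held_out, forecast), rmse(held_out, forecast), mae(held_out, forecast))
```

Changing the `steps` argument and scoring each horizon against held-out years is one way to compare forecast horizons, in the spirit of the one-to-five-step comparison described above.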
Keywords: Topic identification, Topic prediction, Machine learning, Integrated forecast