Research Status And Hotspots Of Statistics In China Based On Text Mining

Posted on: 2020-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: S C Han
Full Text: PDF
GTID: 2370330578981630
Subject: Applied statistics

Abstract/Summary:
This thesis uses descriptive literature analysis and text mining to study high-level literature in the field of statistics in China. It analyzes the research status and research hotspots of the field and describes the state of research on both theoretical methods and application areas, providing a reference for subsequent researchers who want to follow the latest developments in statistics.

First, a Python web crawler was used to collect 4605 high-level documents in the field of statistics published from 2016 to 2018: 2607 dissertations; 802 articles from the journals Statistical Research and Mathematical Statistics and Management; and 1196 high-level statistical documents published by Chinese scholars and indexed in the Web of Science database. The literature data were then preprocessed with text mining techniques: records with missing keywords or abstracts were deleted, literature unrelated to statistical research was deleted, English text was converted to lowercase, semantically identical keywords were unified, and the document abstracts were segmented into words.

Second, descriptive analysis of the literature data gives the research status of statistics: the number of Chinese-language documents is decreasing while the number of English-language documents is increasing; institutions of higher learning are the main force in statistical research; and the citation rate of journal articles is much higher than that of dissertations.

Next, co-word analysis was applied to the literature keywords, with the year as the time node, to obtain the main research content of the domestic statistics field over the three years, including economics, people's livelihood, big data, statistical methods research, and data processing. The analysis also shows how statistical methods are used in combination and how the research content changed. A comparative analysis of literature published at home and abroad shows that foreign literature focuses more on theoretical research, whereas Chinese literature pays more attention to practical applications.

Finally, an LDA topic model was built on the literature abstracts to identify and analyze research hotspots in the field of statistics, and twelve hot topics were found. Comparing the hotspots of Chinese journal literature with those of foreign journal literature shows that the Chinese literature takes economic development and people's livelihood as its main research directions, while the English literature mainly focuses on social and environmental issues. Taking the year as the time node, the thesis also shows how the research hotspots in the field of statistics in China changed over the three years.
Keywords/Search Tags: text mining, co-word analysis, web crawler, LDA, statistics
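
The keyword preprocessing and year-by-year co-word counting described in the abstract could be sketched roughly as follows. This is a minimal illustration and not the thesis's code. The record layout (`year`, `keywords`, `abstract`), the function names, the `SYNONYMS` map, and the sample records are all assumptions made for the example; the thesis does not state which keywords were merged or how its data were stored.

```python
from collections import Counter
from itertools import combinations

# Hypothetical synonym map; the thesis does not list the keywords it unified.
SYNONYMS = {"lda model": "lda", "latent dirichlet allocation": "lda"}


def clean_records(records):
    """Drop records lacking keywords or an abstract, then normalise keywords."""
    cleaned = []
    for rec in records:
        if not rec.get("keywords") or not rec.get("abstract"):
            continue  # missing keywords or abstract: record is discarded
        seen = []
        for kw in rec["keywords"]:
            kw = kw.strip().lower()       # unify English text to lowercase
            kw = SYNONYMS.get(kw, kw)     # unify semantically identical keywords
            if kw and kw not in seen:
                seen.append(kw)
        cleaned.append({**rec, "keywords": seen})
    return cleaned


def coword_counts_by_year(records):
    """Count how often each unordered keyword pair co-occurs, per year."""
    counts = {}
    for rec in records:
        pairs = combinations(sorted(rec["keywords"]), 2)
        counts.setdefault(rec["year"], Counter()).update(pairs)
    return counts


# Toy records invented purely for illustration.
SAMPLE = [
    {"year": 2016, "keywords": ["Big Data", "LDA model"], "abstract": "placeholder"},
    {"year": 2016, "keywords": ["big data", "Latent Dirichlet Allocation"], "abstract": "placeholder"},
    {"year": 2017, "keywords": ["Regression", "big data"], "abstract": "placeholder"},
    {"year": 2017, "keywords": [], "abstract": "placeholder"},
    {"year": 2018, "keywords": ["text mining"], "abstract": ""},
]

if __name__ == "__main__":
    cleaned = clean_records(SAMPLE)
    for year, counter in sorted(coword_counts_by_year(cleaned).items()):
        print(year, counter.most_common(3))
```

Sorting each record's keywords before pairing means that a pair is counted under one key regardless of the order in which the keywords were listed. Word segmentation of abstracts and the LDA topic model are outside the scope of this sketch.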