Research And Applications Of Hot Spot Discovery Methods For Academic Big Data

Posted on: 2020-08-12 | Degree: Master | Type: Thesis
Country: China | Candidate: S S Shan | Full Text: PDF
GTID: 2428330575479903 | Subject: Software engineering
Abstract/Summary:
Innovation is the core driving force of technological development and social progress. For researchers, keeping up with the latest academic hotspots, continually discovering new problems and proposing new methods are the main ways to sustain academic innovation. According to incomplete statistics, more than 3 million academic papers were published worldwide in 2018. Academic information also includes news, blogs and other sources. Faced with this volume of academic data, finding information of interest quickly and efficiently is a challenge for researchers. An effective way to address this problem is to mine keywords that match the meaning of an article from a small amount of text, find the current research hotspots in academic big data, and recommend them to relevant scholars. Based on these ideas, the main research contents of this thesis are as follows:

(1) A keyword extraction algorithm based on DeepWalk is proposed. Keyword extraction is the main technology for discovering academic hotspots. However, because emerging research fields contain relatively few academic papers, the co-occurrence of keywords between articles is difficult to capture. Unlike existing methods, this thesis therefore treats each article as a separate individual when extracting keywords. The specific steps are as follows: first, in the semantic network built from a single article, a random walk strategy is used to obtain a feature vector for each word; then, combined with other auxiliary features of the words, a classifier selects the higher-ranked words as the keywords of the paper.

(2) A keyword extraction algorithm based on a graph convolutional network is proposed. This algorithm applies the graph convolutional network to the problem of keyword extraction for the first time. When a research field is relatively mature, it contains a large number of related articles, and the words in different articles have complex co-occurrence relationships. This thesis first models these relationships as a co-occurrence network between words. Then, combined with the attribute information of the words, the graph convolutional network is used to extract the keywords of each article.

(3) An academic hot spot discovery algorithm based on topic clustering is proposed. Using the two keyword extraction methods above, keywords are extracted from both emerging and existing research fields. On this basis, the K-Means clustering algorithm is used to cluster the hot keywords of different fields, yielding the latest hot research topics in each field.

These three tasks are important parts of the project "Research, Development and Application of the Rapid Knowledge Sharing System in the Era of Big Data and Mobile Internet in Jilin Province". The work has been integrated into the development of the "Academic Headline" app (http://www.acheadline.com/) and has achieved good results. At present, the app has more than 7,200 users and covers more than 4.1 million papers, more than 6,000 journals, more than 6.7 million academic authors and more than 1.4 million keywords. In addition, the validity of the proposed algorithms is verified on an artificial dataset and on public datasets in terms of accuracy, recall and F1.
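
The random walk step of task (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: the sliding-window rule for building the single-article word network, the function names `build_cooccurrence_graph` and `random_walks`, and all parameter values are assumptions chosen for illustration. The sketch only generates the walks; in DeepWalk these walks would then be fed to a skip-gram model to learn the word feature vectors, which is omitted here.

```python
import random
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Link distinct words that appear within `window` positions of each other.

    Assumption: a sliding-window co-occurrence rule stands in for the
    semantic network of a single article described in the abstract.
    """
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def random_walks(graph, walks_per_node=2, walk_length=5, seed=0):
    """Generate truncated uniform random walks starting from every node."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    nodes = sorted(graph)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = sorted(graph[walk[-1]])
                if not neighbours:
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Hypothetical one-sentence "article" used only to exercise the functions.
tokens = "graph convolution network extracts keywords from academic papers".split()
graph = build_cooccurrence_graph(tokens, window=2)
walks = random_walks(graph, walks_per_node=2, walk_length=5)
```

Each walk is a sequence of words in which consecutive entries are neighbours in the co-occurrence network, so words that share context tend to appear in the same walks. That is the property the embedding step relies on.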
Keywords/Search Tags: Academic Big Data, Hot Spot Discovery, Keyword Extraction, Knowledge Representation