Research Hotspot Identification Based On LDA2vec Model Under Multisource Data

Posted on:2020-09-21

Degree:Master

Type:Thesis

Country:China

Candidate:H L Qiu

GTID:2428330575952590

Subject:Library and Information Science

Abstract/Summary:

Information overload is a major problem in the current Internet era, so extracting key information from massive volumes of data is especially important. Scientific literature, the main carrier of scientific and technological innovation knowledge, has grown exponentially and is characterized by multi-source distribution and diverse description formats. Different types of documents, such as papers, patents, conference reports, and government publications, may describe the same subject from different angles. Identifying research hotspots from different sources of scientific literature therefore helps guide subsequent research work. The purpose of this research is to quickly and accurately identify the hot topics contained in texts from multiple data sources using the proposed model, and to provide information support for scientific and technological innovation decisions.

This research first uses the literature research method to analyze research hotspots and scientific research topics, surveys the main methods and topic models used in domestic and international research hotspot identification, and reviews the representative results. It covers five approaches currently used in research hotspot identification (the expert method, citation analysis, knowledge unit analysis, map analysis, and text mining) and summarizes the state of research on the theory of topic models and their application to research hotspot identification. Then, following the model research method, this paper proposes an LDA2vec-based method for identifying research hotspots in multi-source text and builds a corresponding identification model. The method combines the strength of the LDA topic model in mining latent semantics with the strength of the Word2Vec word vector model in capturing contextual relationships.

To verify the effectiveness of the method, experimental analysis and statistical analysis are applied to scientific literature in the field of machine learning: the titles and abstracts of journal papers and patent documents are collected and fused as the experimental data source. Perplexity and topic coherence are used to compare the topic extraction performance of LDA2vec and LDA on multi-source text. In addition, the topic extraction performance of the proposed method is compared between multi-source and single-source settings. The experimental results show that the proposed method is feasible and achieves some improvement when applied to multi-source data. The method can identify hot content in multi-source text relatively quickly and accurately, compensates for the shortcomings of topic detection based on a single data source, and enriches the practical application of multi-source data fusion theory.
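The two evaluation measures named in the abstract, perplexity and topic coherence, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not say which coherence measure was used, so the UMass variant is assumed here, and the function names (perplexity, umass_coherence), the toy records, and the concatenation-based fusion are all hypothetical choices made for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(-average per-token log-likelihood); lower is better."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def umass_coherence(top_words, documents):
    """UMass coherence of one topic; values closer to 0 suggest a more coherent topic.

    top_words: the topic's words, ordered from most to least probable.
    documents: list of tokenized documents (lists of words).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                continue  # skip words that never occur in the corpus
            score += math.log((doc_freq(top_words[m], top_words[l]) + 1) / denom)
    return score


# Toy, made-up records standing in for fused paper and patent titles/abstracts.
papers = [["topic", "model", "lda"], ["lda", "topic", "inference"]]
patents = [["patent", "claim", "device"], ["topic", "model", "device"]]
corpus = papers + patents  # assumed fusion step: simply concatenate the sources

print(umass_coherence(["topic", "lda"], corpus))   # words that co-occur
print(umass_coherence(["lda", "patent"], corpus))  # words that never co-occur
print(perplexity([math.log(0.25)] * 8))            # uniform model over 4 words
```

In a comparison like the one described, the same measures would be computed for each model (for example LDA and LDA2vec) on the same fused corpus, with the per-token log-likelihoods and top topic words taken from the trained models rather than from toy values.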

Keywords/Search Tags:

Topic model, LDA2vec, Research Hotspot, LDA, word2vec, multisource data fusion

Related items

1	Research Hotspot Situation Analysis Based On Topic Model
2	Research Of Topic Detection And Tracking Based On Multisource Data
3	A Personalized News Recommendation Research Based On Lda2vec And Restricted Boltzmann Machine
4	Research On Model Of Hot Topic Opinion Mining In Virtual Communities
5	Research On Multisource Image Fusion Algorithm And Its Applications Based On Statistics And Reasoning
6	Research On Topic Clustering Algorithm Based On Topic Models
7	Research On Estimate And Forecast Technologies Of Microblog Hotspot
8	Research On Anomaly Detection Oriented Technologies Of Multisource Remote Sensing Imagery Fusion
9	Research Of Recommendation Algorithm Based On LDA And Word2Vec
10	Research On Hot Topic Detection Technology Of Weibo Based On Word2vec