Research Of Topic Detection Based On Correlation And Graph Analysis Theories

Posted on: 2020-08-19  Degree: Master  Type: Thesis
Country: China  Candidate: M L Cheng  Full Text: PDF
GTID: 2428330578952877  Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet technology, many social networking sites and e-commerce platforms have emerged. As information carriers, these platforms have become indispensable for sharing information and maintaining social relationships, and the volume of online text has grown explosively as a result. How to detect valuable topics and their development trends from massive amounts of text quickly and effectively has long been a hot topic in data mining. Topic detection and tracking (TDT), a representative approach to topic analysis, aims to detect topics and their changing trends in various text corpora. Topic detection, a sub-task of TDT, has become a probe for discovering unexpected events and tracking the development of specific social activities, because it detects topics and trends efficiently.

Some studies detect topics with topic models. Among these, the LDA model is recognized as an effective algorithm because it provides a more natural representation of text. However, it assumes that the words in a document are independent of one another and does not consider the co-occurrence of words or terms, which prevents the detection of implicit but crucial topics. Other studies use graph analysis to detect topics from co-occurrence relationships: they convert the text into a term graph based on the relations between words and then partition the graph into topics. This approach focuses on network structure and ignores node attributes, so the generated topics lack meaning and semantic coherence. To integrate semantic relations with co-occurrence relations, a joint framework based on the LDA topic model and graph analysis has been proposed, which detects topics more effectively and can mine important and rare topics. However, this method still relies on the LDA model to extract semantic relationships. The assumption that topics are independent cannot reflect the correlation between topics, which leads to unrealistic modeling and low precision in semantic information extraction.

To solve these problems, this thesis combines correlation theory with an improved graph analysis method for topic detection. First, it proposes a graph analysis method based on LDA cosine similarity. The LDA model is used to extract semantic information and topics, and the cosine similarity between topics is then computed to quantify the correlation between them. At modest computational cost, this compensates for the independent-topic assumption of the LDA model and improves the accuracy of semantic information extraction to some extent. Second, to match the way related topics are actually expressed in real text, the thesis proposes a graph analysis framework based on the CTM model. Topic feature vectors are obtained with the CTM, a correlated topic model, and are then optimized to extract the best features; this takes topic relevance into account and also reduces the vector dimension. Finally, the CorrelationGraph algorithm is proposed, which quantifies topic correlations and then uses them to extract co-occurrence relations. Topic correlation is thus used to analyze semantic relationships and co-occurrence relationships at the same time, integrating the two more fully, so that high-precision semantic information and latent co-occurrence relations are extracted and important implicit topics and their changing trends are discovered.

The thesis is divided into five chapters. The first chapter introduces the research background and significance of the topic and reviews the current state of research in the field in China and abroad. The second chapter introduces in detail the concepts and theoretical basis of topic detection and explains the mathematical background and models involved. The third chapter presents the graph analysis method for topic detection based on LDA cosine similarity, describing its generation process and the related experiments in detail. The fourth chapter introduces the graph analysis framework for topic detection based on the CTM model, describes the topic generation process and the related theoretical reasoning through the CorrelationGraph algorithm, and verifies the validity of the framework with simulation experiments. The fifth chapter summarizes the study and outlines directions for future research.
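As a rough illustration of the cosine-similarity step mentioned in the abstract (quantifying the correlation between LDA topics), the following minimal sketch compares topic-word vectors pairwise and keeps the pairs above a cutoff as edges of a topic correlation graph. It is not the thesis's implementation: the function names, the example `topic_word` matrix, and the `threshold` parameter are all assumptions made for illustration, and in practice the vectors would come from a fitted LDA model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def topic_correlation_edges(topic_word, threshold):
    """Return (i, j, similarity) for every topic pair whose similarity is >= threshold.

    `threshold` is a hypothetical cutoff chosen for this sketch.
    """
    edges = []
    for i in range(len(topic_word)):
        for j in range(i + 1, len(topic_word)):
            sim = cosine_similarity(topic_word[i], topic_word[j])
            if sim >= threshold:
                edges.append((i, j, sim))
    return edges


# Hypothetical topic-word distributions over a 4-word vocabulary (made-up values).
topic_word = [
    [0.7, 0.2, 0.1, 0.0],
    [0.6, 0.3, 0.1, 0.0],
    [0.0, 0.1, 0.2, 0.7],
]

# Topics 0 and 1 put most of their mass on the same words, so only that pair survives.
edges = topic_correlation_edges(topic_word, threshold=0.5)
for i, j, sim in edges:
    print(f"topic {i} <-> topic {j}: similarity {sim:.3f}")
```

Thresholding is only one simple way to turn pairwise similarities into a graph; the abstract does not say how the thesis builds its graph from the quantified correlations.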
Keywords/Search Tags: Topic detection, Graph analysis, Topic correlation, LDA model, CTM model