
Topic Discovery From Social Network Texts With Heterogeneous Semantic Features

Posted on: 2022-08-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J He
Full Text: PDF
GTID: 1488306560453664
Subject: Computer application technology
Abstract/Summary:
The topic model is an effective technique for text analysis and has been widely applied in public opinion analysis, Q&A systems, personalized recommendation, and other fields. With the rapid development of social networks and the emergence of diverse application platforms, multi-source, real-time data now exhibit heterogeneous semantic features in three respects: the types of data involved, the degree of attention that topics receive from users, and the timeliness of topics. These features pose new challenges for traditional topic models, which struggle to learn and represent them. Effective representation and learning methods for heterogeneous semantic features therefore have both research value and practical significance. Building on research in general topic discovery, targeted topic discovery, and knowledge graph embedding, this thesis proposes several topic discovery algorithms for social network texts with specific types of heterogeneous semantic features. The main contributions of this dissertation are summarized as follows.

(1) A topic discovery method for heterogeneous text types is proposed. Different types of data may be generated on the same social network platform, yet they generally share topics. Conventional topic discovery methods focus on only one data type, which makes them less effective on the heterogeneous semantic features produced by differences in topic structure and topic density. To this end, this thesis proposes SSWTM, a topic discovery method based on a self-adaptive sliding window. By adaptively adjusting the size of the sliding window used to extract word pairs from documents, the method accounts for the sparsity of short texts while avoiding topic redundancy in long texts. Experimental results demonstrate that the method is suitable for topic discovery over heterogeneous types of text data and performs well on document classification.

(2) Two targeted topic discovery methods are proposed. The topics in a text receive different degrees of attention from users with different interests, which results in heterogeneity of attention across the data. Conventional topic discovery methods generally analyze the full corpus and fail to highlight the targeted topics of a particular domain. To this end, this thesis proposes two methods for targeted topic discovery, TATM and HFTM. The former balances topic homogeneity and topic completeness, while the latter constructs hierarchical semantics for targeted topics; both effectively refine the granularity of topic discovery. Experimental results show that TATM and HFTM are suitable for targeted topic discovery tasks and address the sparsity of the semantic features of targeted topics. Furthermore, both methods are more time-efficient than the baselines.

(3) An interpretable dynamic topic discovery method is proposed. The text content and data associations of social networks change over time, producing a dynamic topic structure in which the meanings of topic words shift from one time point to another. To this end, this thesis proposes KITE, an interpretable topic discovery and tracking method that incorporates both global and local knowledge, enabling topic discovery on texts with heterogeneous timeliness. Experimental results validate that the method not only improves topic interpretability by leveraging knowledge graph embeddings, but also improves the model's sensitivity to topic evolution by updating neighbor knowledge.
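As a rough illustration of the sliding-window word-pair extraction mentioned in contribution (1), the Python sketch below pairs words that co-occur within a window whose size depends on document length. The abstract does not state how SSWTM actually adapts the window, so the rule used here (whole-document window for short texts, a fixed cap for long ones), the function names, and the thresholds `short_len` and `max_window` are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def adaptive_window_size(n_tokens: int, short_len: int = 10, max_window: int = 5) -> int:
    """Hypothetical adaptation rule (not taken from the thesis).

    Short texts use the whole document as a single window, which yields
    more pairs from sparse data. Longer texts use a capped window so that
    distant, likely unrelated words are not paired.
    """
    if n_tokens <= short_len:
        return max(n_tokens, 1)
    return max_window


def extract_word_pairs(tokens: List[str]) -> List[Tuple[str, str]]:
    """Return unordered word pairs whose positions fall within one window."""
    window = adaptive_window_size(len(tokens))
    pairs = []
    for i in range(len(tokens)):
        # Pair token i with each later token that is fewer than `window` positions away.
        for j in range(i + 1, min(i + window, len(tokens))):
            if tokens[i] != tokens[j]:
                pairs.append(tuple(sorted((tokens[i], tokens[j]))))
    return pairs


short_doc = ["apple", "phone", "launch"]
long_doc = [f"w{k:02d}" for k in range(12)]
short_pairs = extract_word_pairs(short_doc)
long_pairs = extract_word_pairs(long_doc)
```

Under these assumptions, the three-word document yields every possible pair, while the twelve-word document only pairs words at most four positions apart. A topic model over word pairs would then be fitted to the extracted pairs; that step is outside the scope of this sketch.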
Keywords/Search Tags:Topic Discovery, Social Networks, Heterogeneous Semantic Features, Targeted Topic Modeling, Knowledge Graph Embeddings