A Sentence-Embedding-Based Method for Sub-Topic Detection

Posted on: 2019-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xie
GTID: 2428330611493294
Subject: Software engineering
Abstract/Summary:
Online social networks have gradually permeated every aspect of people's lives, including politics, education, the economy, and culture, and online social platforms have become an important venue for public comment. A hot topic typically contains many sub-topics, each describing the topic from a different angle, so discovering sub-topics accurately and effectively is of great significance to public opinion analysis. The main purpose of sub-topic detection is to analyze a topic in finer detail, identify the focus of public attention, and predict how the topic will evolve. This thesis studies Weibo sub-topic detection based on sentence embeddings. After examining traditional word embeddings and sentence embeddings in light of the characteristics of Weibo posts, it argues that because short texts cannot provide enough context, traditional sentence embeddings do not represent them well, and that the features of short texts need to be enriched through feature expansion. Building on existing techniques, the thesis proposes a sub-topic detection method based on sentence embeddings: by combining topic information with sentence embeddings, the semantic features of the sentence embeddings are enhanced so that latent sub-topics under the same topic can be discovered. A distributed implementation of the proposed method is built on the CDH big data platform. Text representation ability and sub-topic detection ability are each verified on Weibo data. The experimental data set was constructed by crawling Weibo posts about the 2015 Tianjin port explosion, and the proposed method was compared with a baseline method. The experimental results demonstrate the effectiveness of the proposed method.
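
The abstract does not say how topic information and sentence embeddings are combined, nor which clustering algorithm is used. The following is only a minimal sketch of the general idea under stated assumptions: the sentence embedding is an average of word vectors, the topic information is a per-post topic distribution (for example, from a topic model), the two are combined by concatenation, and sub-topics are found with a small k-means routine. The function names, the `topic_weight` parameter, and the toy data are all hypothetical and are not taken from the thesis.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def enhanced_embedding(tokens, word_vectors, topic_dist, dim, topic_weight=1.0):
    """Assumed combination: averaged word vectors concatenated with topic weights."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    sentence_vec = mean_vector(known) if known else [0.0] * dim
    return l2_normalize(sentence_vec) + [topic_weight * p for p in topic_dist]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Recompute centroids; an empty cluster keeps its previous centroid.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels


# Toy, hand-made inputs (hypothetical): 2-d word vectors and 2-topic distributions.
word_vectors = {
    "explosion": [1.0, 0.0],
    "fire": [0.9, 0.1],
    "rescue": [0.0, 1.0],
    "firefighters": [0.1, 0.9],
}
posts = [
    (["explosion", "fire"], [0.9, 0.1]),
    (["fire", "explosion", "port"], [0.8, 0.2]),  # "port" has no vector and is skipped
    (["rescue", "firefighters"], [0.1, 0.9]),
    (["firefighters", "rescue"], [0.2, 0.8]),
]

vectors = [enhanced_embedding(tokens, word_vectors, topics, dim=2) for tokens, topics in posts]
labels = kmeans(vectors, k=2)
print(labels)
```

In this sketch, `topic_weight` controls how strongly the topic distribution influences the distance between posts relative to the sentence vector. Concatenation is only one possible way to combine the two signals.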
Keywords/Search Tags: Sub-topic detection, Sentence embedding, Clustering