Improved Text Topic Representation And Learning Method

Posted on: 2019-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: H R Zhao
Full Text: PDF
GTID: 2348330566959847
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology, a large amount of text data has emerged on the Internet. To understand and semantically analyze this text more accurately and deeply, we use data mining and deep learning methods to improve existing topic models, obtaining more explicit topic expressions and richer topic semantic information. This work is significant for the representation and learning of text topics. Our research on improving topic representation and topic learning methods for text is as follows:

1. Existing topic models have defects in representing the semantics of text topics, including ignoring the semantic and grammatical correlations among words and poor interpretability. To alleviate these problems, we propose a semi-supervised topic representation method based on association rules and metadata. We obtain a weighted association rule algorithm by adding a weight value for each term to the traditional association rule mining algorithm. Based on the topic model, relationships among words are discovered, and topic semantics are expressed as triples (term1, relationship, term2). This not only reduces the number of redundant topics but also adds semantic relationship information among words, yielding a more detailed and more explicit form of the topic. The experimental results show that, compared with other topic semantic representation methods for text, the improved method adds semantic relationships among words and enriches the topic semantic information, which improves the interpretability of the topics.

2. Existing topic models also have defects in learning the semantics of text topics, including poor semantic accuracy, coarse granularity, and difficulty in calculating topic similarity at the semantic level. We establish a Topic2Vec model based on deep learning. It combines topic learning with the neural network learning of distributed word vectors, so that distributed vector representations of topics are learned together with distributed vector representations of words. This improves the accuracy of topic semantic learning, refines its granularity, and makes it easier to calculate topic similarity at the semantic level. The experimental results show that the improved topic semantic learning method is superior to the traditional method in topic extraction accuracy, topic representation granularity, topic differentiation, and semantic similarity calculation, which validates the effectiveness of the method.
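
The abstract does not give the exact weighting scheme, so the following is only a minimal sketch of how per-term weights might enter association rule mining and how the resulting rules could be written as (term1, relationship, term2) triples. The function name `weighted_rules`, the definition of weighted support (mean term weight times document frequency), the unweighted confidence ratio, the thresholds, and the relationship label "implies" are all illustrative assumptions, not the thesis's actual algorithm.

```python
from itertools import combinations


def weighted_rules(docs, weights, min_support=0.3, min_confidence=0.7):
    """Return (term1, relationship, term2) triples from pairwise rules.

    Assumptions (not taken from the thesis):
    - weighted support = mean term weight * fraction of documents
      that contain every term in the itemset;
    - confidence is the plain ratio count(x, y) / count(x);
    - the relationship label "implies" is a placeholder.
    """
    doc_sets = [set(d) for d in docs]
    n = len(doc_sets)
    vocab = sorted(set().union(*doc_sets))

    def count(items):
        # Number of documents containing all the given terms.
        return sum(1 for d in doc_sets if set(items) <= d)

    def weighted_support(items):
        mean_weight = sum(weights.get(t, 1.0) for t in items) / len(items)
        return mean_weight * count(items) / n

    triples = []
    for a, b in combinations(vocab, 2):
        if weighted_support((a, b)) < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            if count((x, y)) / count((x,)) >= min_confidence:
                triples.append((x, "implies", y))
    return triples


# Hypothetical toy corpus and term weights, for illustration only.
docs = [
    ["topic", "model", "word"],
    ["topic", "model"],
    ["topic", "vector"],
    ["word", "vector"],
]
weights = {"topic": 1.0, "model": 0.8, "word": 0.5, "vector": 0.5}

print(weighted_rules(docs, weights))
```

Weighting support by term importance lets a pair of highly weighted terms pass the threshold with fewer co-occurrences than a pair of low-weight terms, which is one plausible way to favor topic-relevant relationships.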
Keywords/Search Tags: Topic model, Weighted association rule mining, Deep learning, Word embedding, Topic semantic similarity
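
Once topics are represented as distributed vectors, topic semantic similarity can be computed directly between those vectors. The abstract does not name the similarity measure, so the sketch below assumes cosine similarity; the topic names and vector values are made up for illustration and are not results from the thesis.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # Convention chosen here for zero vectors.
    return dot / (norm_u * norm_v)


def most_similar_topic(name, topic_vectors):
    """Return (other_topic, similarity) for the closest other topic."""
    query = topic_vectors[name]
    scored = (
        (other, cosine_similarity(query, vec))
        for other, vec in topic_vectors.items()
        if other != name
    )
    return max(scored, key=lambda pair: pair[1])


# Hypothetical 3-dimensional topic vectors, for illustration only.
topic_vectors = {
    "sports": [0.9, 0.1, 0.0],
    "football": [0.8, 0.2, 0.1],
    "finance": [0.0, 0.1, 0.9],
}

print(most_similar_topic("sports", topic_vectors))
```

Because words and topics would share one vector space under the approach described in the abstract, the same function could in principle compare a topic vector with a word vector as well.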