
The Research Of Sub-Topic Division Methods Based On Full Covering And Granular Computing In News Documents

Posted on: 2018-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Su
GTID: 2348330536465880
Subject: Information and Communication Engineering
Abstract/Summary:
In the modern world, data is growing explosively and information pours into daily life from all directions. Faced with such volumes of data, users who want to find the news topics they care about quickly and accurately face great challenges. How to classify and organize large numbers of news reports and events has therefore become an important research topic in natural language processing. Topic recognition and classification technology emerged to summarize information on related topics automatically for people to browse; it is dedicated to the effective organization, search, and structuring of different text collections. Full covering granular computing, which combines the theory of full covering with granular computing, is a new research method for information processing and data mining. It offers a new approach to mining massive data with uncertain and incomplete information, and a new solution for sub-topic division.

1. In this dissertation, the LDA topic model is used to extract the hidden topics of news documents and obtain the "document-topic" matrix θ. Through multiple experiments, an appropriate threshold is set for the probabilities in the matrix, and the θ matrix is then converted into a full covering model. On the basis of full covering granular computing, granular reduction is used to delete redundant covering elements and obtain the simplest covering elements.

2. From the perspective of set theory, this dissertation uses the idea of a ring and proposes the Derived Partition algorithm. The theoretical basis of the algorithm is discussed and its time complexity is analyzed. In addition, the structure and process of the algorithm are optimized, and extensive experiments show that its performance is improved. Finally, the algorithm is further explained by an example.

3. On the basis of the LDA topic model and the Derived Partition algorithm, this dissertation designs a sub-topic division method based on full covering granular computing. Comparison with three traditional baseline methods and the classical Single-Pass method on the Sogou news corpus verifies the applicability, feasibility and scalability of the proposed method. The results show that the proposed method achieves better sub-topic division.
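
The pipeline in points 1 and 2 can be illustrated with a small Python sketch. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: the function names are hypothetical, a document is assigned to a topic block when its probability is at least the threshold, a block counts as redundant when it is a union of other blocks, and the derived partition groups documents that belong to exactly the same blocks. The matrix `theta` is toy data, not output from the thesis.

```python
from collections import defaultdict


def theta_to_covering(theta, threshold):
    """Build one block of document indices per topic (assumed rule: prob >= threshold)."""
    n_topics = len(theta[0])
    blocks = []
    for k in range(n_topics):
        block = frozenset(d for d, row in enumerate(theta) if row[k] >= threshold)
        if block:  # skip topics that attract no document
            blocks.append(block)
    return blocks


def reduce_covering(blocks):
    """Drop duplicates and blocks that are unions of other blocks (assumed redundancy notion)."""
    unique = list(dict.fromkeys(blocks))  # de-duplicate, keep order
    kept = []
    for b in unique:
        # Only proper subsets of b can combine to form b.
        smaller = [c for c in unique if c < b]
        if frozenset().union(*smaller) != b:
            kept.append(b)
    return kept


def derived_partition(blocks, n_docs):
    """Group documents by the set of blocks that contain them (assumed reading)."""
    groups = defaultdict(list)
    for d in range(n_docs):
        signature = frozenset(i for i, b in enumerate(blocks) if d in b)
        groups[signature].append(d)
    return sorted(groups.values())


# Toy document-topic matrix: 6 documents, 4 topics, each row sums to 1.
theta = [
    [0.80, 0.10, 0.05, 0.05],
    [0.70, 0.10, 0.10, 0.10],
    [0.30, 0.30, 0.10, 0.30],
    [0.10, 0.50, 0.10, 0.30],
    [0.10, 0.10, 0.50, 0.30],
    [0.05, 0.05, 0.60, 0.30],
]

covering = theta_to_covering(theta, threshold=0.2)
reduced = reduce_covering(covering)
partition = derived_partition(reduced, n_docs=len(theta))
print(reduced)
print(partition)
```

With this toy data the fourth topic block {2, 3, 4, 5} is the union of blocks {2, 3} and {4, 5}, so the reduction step removes it, and the derived partition is [[0, 1], [2], [3], [4, 5]]. A threshold that is too high can leave a document outside every block; in this sketch such documents would simply end up together in one group.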
Keywords/Search Tags: full covering granular computing, topic model, derived partition, sub-topic division