Mining Mechanism of Knowledge Clusters and Associations Based on Cloud Platform

Posted on: 2017-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: J J Liu
GTID: 2348330488997107
Subject: Software engineering
Abstract/Summary:
In the era of informatization and big data, text data of many kinds, including search engine queries, e-commerce user reviews, and article citations, is appearing rapidly in daily life. The time people spend querying and the volume of their queries have increased markedly as well. To improve service quality and encourage innovation in services, it is necessary to mine short text data and to provide semantic meta tags for text. Furthermore, multi-granularity mining of text citations, references, keywords, and similar elements can improve readers' reading efficiency and quality. The overall approach of this thesis is first to cluster static text so that text information is archived automatically, then to perform dynamic association rule analysis based on the user's browsing process to obtain the frequent item sets of the dynamic text data, and finally to find the association rules of those frequent item sets within the clustering results. This method improves the efficiency of querying text information and has significant application prospects.
Building on an in-depth understanding of cloud platforms and data mining, this thesis improves an existing clustering algorithm by adding a multi-dimensional extraction strategy based on document analysis, outlier detection, and improved selection of initial centers, and runs it with MapReduce on the cloud platform to improve clustering quality and efficiency. For the dynamic process of user browsing, an association rule method based on weighted-matrix FP-Growth is proposed: records are filtered by a time factor to obtain an initial matrix, and a weight vector is then calculated to improve the FP-Growth algorithm. At the same time, the method addresses updates to the dynamic transaction set and changes in support, filters by category according to the clustering results, and runs in parallel on the cloud platform to improve the algorithm's performance and space efficiency, ultimately yielding frequent item sets more efficiently and accurately and laying a foundation for future research on information push. Finally, the improved algorithms are verified on an experimental platform, and the results show that their performance and efficiency improve considerably.
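The time-filtering and weighting idea described in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm: the abstract does not specify how the initial matrix or the weight vector is computed, so the sketch assumes that each browsing session is a list of (item, seconds viewed) pairs, that items viewed for less than a threshold are dropped, and that each remaining transaction is weighted by its total viewing time. The function names `build_transactions` and `weighted_frequent_itemsets` are hypothetical, and candidate item sets are enumerated by brute force rather than with an FP-tree.

```python
from itertools import combinations


def build_transactions(sessions, min_seconds):
    """Filter each session by viewing time and attach a weight.

    Assumption: the weight of a transaction is the total viewing time of
    the items that survive the filter. Returns (itemset, weight) pairs.
    """
    transactions = []
    for views in sessions:
        kept = {item: secs for item, secs in views if secs >= min_seconds}
        if kept:  # sessions with no surviving items are discarded
            transactions.append((frozenset(kept), sum(kept.values())))
    return transactions


def weighted_frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: weighted support} for item sets above min_support.

    Weighted support = (sum of weights of transactions containing the
    item set) / (sum of all weights). Candidates are enumerated by brute
    force; a real implementation would use FP-Growth instead.
    """
    total = sum(weight for _, weight in transactions)
    if total == 0:
        return {}
    items = sorted({item for itemset, _ in transactions for item in itemset})
    result = {}
    for size in range(1, max_size + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            covered = sum(w for itemset, w in transactions if cset <= itemset)
            support = covered / total
            if support >= min_support:
                result[cset] = support
                found_any = True
        if not found_any:
            break  # no frequent set of this size, so none of a larger size
    return result
```

Calling `build_transactions(sessions, 5)` followed by `weighted_frequent_itemsets(transactions, 0.5)` returns the item sets whose weighted support is at least 0.5. Restricting `transactions` to the sessions of one cluster before counting would correspond to the category filter mentioned in the abstract, again as an assumption about how that step might work.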
Keywords/Search Tags: Data mining, Hadoop, Clustering, Association Rules, MapReduce