
Research And Application Of Implicating Rules Extraction Of Dynamic Data

Posted on: 2018-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: J X Du
Full Text: PDF
GTID: 2348330512475469
Subject: Computer Science and Technology
Abstract/Summary:
In the era of big data, structured, semi-structured, and unstructured data exist widely in every field of social life, and new data are generated continuously. How to handle the dynamic nature of data is therefore both an important open scientific problem and an urgent practical need. Implication is an important form of relation in logic, and the study of implication rules for dynamic data has significant theoretical value and broad application prospects. This thesis studies the extraction of implication rules from dynamic data: it discusses methods for processing dynamic data and studies association rule extraction for incremental data; it designs a parallel association rule extraction algorithm for the Hadoop computing environment; and it studies the extraction of implication association rules. The main contents are as follows.

To address dynamic data processing, incremental processing methods for dynamic data are summarized. Incremental data is a typical form of dynamic data. It is usually processed by merging the new data set with the original data set and then reprocessing the entire updated data set, which is inefficient and wastes computational resources. A different approach is therefore adopted: only the new data is processed, in parallel, to extract rules or knowledge, and the results are combined with the previous processing results to complete the incremental rule update.

Based on this incremental processing method, an incremental association rule updating algorithm, DIAR, is proposed for association rule mining in a big data environment. Building on traditional association rule mining, incremental updating is a low-cost way to obtain new rules by reusing rules that have already been discovered. The algorithm is implemented in the Hadoop parallel computing environment. First, frequent itemsets are extracted from the original data set using an association rule algorithm. Next, the new data set is processed with an association rule algorithm to capture which itemsets are frequent in the new data. Finally, the relationship between the frequent itemsets of the old and new data is analyzed, and the data is divided into different sets.

To assess the validity of association rules, the concept of implication strength is introduced as an index for weighing rule validity, which yields implication association rules. Implication association rules simplify the rule set and eliminate unwanted, redundant rules.

The DIAR algorithm is validated in the Hadoop parallel computing environment on a data set captured from web pages. The data sets are 1 GB, 2 GB, 3 GB, 4 GB, and 5 GB in size, and the new data set is 0.1 GB. The experimental results verify the effectiveness of the method.
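
The incremental idea in the abstract (mine the old data once, mine only the new data afterwards, then compare the two sets of frequent itemsets) can be illustrated with a small sketch. This is not the DIAR algorithm itself, whose details the abstract does not give. It is a minimal single-machine Python illustration of a generic incremental frequent-itemset update, with no Hadoop involved. The function names, the brute-force counting, and the `max_size` limit are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Count every itemset of up to max_size items (brute force, illustration only)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def frequent(counts, n, min_support):
    """Keep the itemsets whose relative support in n transactions reaches min_support."""
    return {s: c for s, c in counts.items() if c / n >= min_support}


def incremental_update(old_freq, old_db, new_db, min_support, max_size=2):
    """Update frequent itemsets after new_db arrives, reusing the old result.

    old_freq maps each itemset that was frequent in old_db to its count there.
    Itemsets are divided into three sets: frequent in both parts, frequent only
    in the old data, and frequent only in the new data. An itemset that is
    infrequent in both parts cannot be frequent in the merged data.
    """
    n_old, n_new = len(old_db), len(new_db)
    n_total = n_old + n_new
    new_counts = count_itemsets(new_db, max_size)
    new_freq = frequent(new_counts, n_new, min_support)

    both = old_freq.keys() & new_freq.keys()
    old_only = old_freq.keys() - new_freq.keys()
    new_only = new_freq.keys() - old_freq.keys()

    updated = {}
    for s in both:
        # Frequent in both parts, so certainly frequent in the merged data.
        updated[s] = old_freq[s] + new_freq[s]
    for s in old_only:
        # The old count is known; the count in the small new data is cheap.
        total = old_freq[s] + new_counts.get(s, 0)
        if total / n_total >= min_support:
            updated[s] = total
    for s in new_only:
        # Only these candidates need to be counted in the old data again.
        old_count = sum(1 for t in old_db if set(s) <= set(t))
        total = old_count + new_freq[s]
        if total / n_total >= min_support:
            updated[s] = total
    return updated
```

In this sketch, the old data is revisited only for itemsets that are frequent in the new data but were not frequent before, which is where the saving over remining the merged data set comes from.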
Keywords/Search Tags:rule extraction, implicit rules, dynamic data, incremental data, parallel computing