
Research And Partial Realization Of Association And Classification Under Big Data

Posted on: 2019-10-21  Degree: Master  Type: Thesis
Country: China  Candidate: S Q Miao  Full Text: PDF
GTID: 2428330548986987  Subject: Computer application technology
Abstract/Summary:
Associative classification combines association rule mining with classification techniques. It has been widely studied because it is highly extensible and classifies accurately. The algorithm mines class association rules and uses the resulting frequent rules as the classification model for prediction, which makes it an important technique in data analysis and processing. However, associative classification also has problems. During execution it can generate a huge number of redundant rules, which consume memory and also degrade the classification result. In addition, on imbalanced data the support of minority data is often too low for the corresponding rules to be discovered. To address these problems, the main contributions of this thesis are as follows:
(1) An associative classification algorithm based on twice learning is proposed. In the twice learning stage, the improved algorithm incorporates naive Bayes classification, which effectively solves the problem that the CBA algorithm cannot generate association rules matching the data to be classified. A series of experiments demonstrates the effectiveness of the twice learning associative classification algorithm and shows that it improves classification accuracy.
(2) A weighted associative classification algorithm is proposed. When the CBA algorithm handles imbalanced data, the support of minority data can fall below min-support, which makes minority data difficult to discover. The weighted associative classification algorithm assigns a weight to each data item and each class attribute, and ensures that rules for minority data are generated by computing a weighted support. Experiments show that the weighted algorithm can mine minority data and improves prediction on minority data.
(3) An optimized associative classification algorithm is proposed. Because classifier construction relies only on confidence, it can lead to overfitting. The optimized algorithm uses antecedent length, confidence, and support together as the framework for selecting rules, and builds the classifier from the best rules. This reduces the number of redundant rules, lowers memory usage, and improves performance.
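The weighted support in (2) and the rule ordering in (3) can be illustrated with a small Python sketch. The abstract does not give the thesis's actual formulas, so everything here is an assumption made for illustration: the inverse-class-frequency weights in `class_weights`, the normalization in `weighted_support`, and the lexicographic ordering in `rank_rules` (higher confidence, then higher support, then shorter antecedent) are hypothetical stand-ins, not the author's method.

```python
from collections import Counter
from typing import NamedTuple


class Rule(NamedTuple):
    antecedent: frozenset  # items that must all appear in a row
    label: str             # predicted class
    support: float
    confidence: float


def class_weights(labels):
    """Assumed weighting: inverse class frequency, so that every class
    carries the same total weight regardless of how many rows it has."""
    counts = Counter(labels)
    return {c: len(labels) / (len(counts) * k) for c, k in counts.items()}


def weighted_support(antecedent, label, rows, labels, weights):
    """Weight of rows matching the rule, divided by the total weight."""
    total = sum(weights[y] for y in labels)
    hit = sum(weights[y] for x, y in zip(rows, labels)
              if y == label and antecedent <= x)
    return hit / total


def rank_rules(rules):
    """Assumed ordering: confidence desc, support desc, antecedent length asc."""
    return sorted(rules,
                  key=lambda r: (-r.confidence, -r.support, len(r.antecedent)))


# Toy imbalanced data: 8 majority rows ("neg"), 2 minority rows ("pos").
rows = [frozenset({"a"})] * 8 + [frozenset({"b"})] * 2
labels = ["neg"] * 8 + ["pos"] * 2

uniform = {c: 1.0 for c in set(labels)}  # plain, unweighted support
plain = weighted_support(frozenset({"b"}), "pos", rows, labels, uniform)
weighted = weighted_support(frozenset({"b"}), "pos", rows, labels,
                            class_weights(labels))
print(plain, weighted)  # 0.2 0.5

candidates = [
    Rule(frozenset({"a", "c"}), "neg", 0.4, 0.9),
    Rule(frozenset({"a"}), "neg", 0.4, 0.9),
    Rule(frozenset({"b"}), "pos", 0.5, 0.8),
]
ranked = rank_rules(candidates)
```

With a min-support of 0.3, for example, the minority rule `{b} -> pos` would be pruned under plain support (0.2) but kept under this weighted support (0.5). In `ranked`, the shorter rule `{a} -> neg` comes ahead of `{a, c} -> neg` because the two tie on confidence and support.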
Keywords/Search Tags:associative classification algorithm, CBA algorithm, twice learning, weighted associative classification, imbalanced data