
Research And Application Of Top-K Closed High Utility Pattern Mining Method

Posted on: 2020-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: S F Wang
GTID: 2428330623457656
Subject: Computer technology
Abstract/Summary:
With the rapid development of computer and Internet technology, a large amount of data has been generated, and much interesting information is contained in it. Data mining technology can effectively mine and analyze this information to support tasks such as recommendation, prediction and classification. In the field of data mining, high utility pattern mining plays an important role. A high utility pattern is a pattern whose utility value is greater than the minimum utility value specified by the user. High utility patterns store a lot of information, but many of them are redundant. The closed high utility pattern proposed by researchers can effectively reduce this redundancy: a pattern is a closed high utility pattern if no proper superset of it has the same support and its utility value is greater than the minimum utility value. In practical applications, mining high utility patterns requires multiple attempts to find a suitable minimum utility value, which costs a lot of tuning time. The Top-K high utility pattern can effectively solve this problem: it refers to the k patterns with the largest utility values, where k is specified by the user. Each approach, however, leaves the other problem open. Closed high utility patterns remove redundant patterns but still require tuning the minimum utility value, while Top-K high utility patterns avoid the threshold but still contain many redundant patterns. To address these two problems, this thesis first studies and analyzes compact high utility patterns; it then proposes a Top-K closed high utility pattern mining algorithm, TKCU-Miner, and, based on TKCU-Miner, a Top-K closed high utility association rule mining algorithm; finally, it designs and implements a verification platform for Top-K closed high utility pattern mining methods. The main research contents are as follows:

(1) Introduce the research background of pattern mining, including the characteristics and related methods of frequent pattern mining and high utility pattern mining. The concepts and characteristics of compact high utility patterns are summarized, and the mining methods for three kinds of compact high utility patterns (Top-K high utility patterns, closed high utility patterns and maximal high utility patterns) are summarized and analyzed. The thesis also analyzes the characteristics and methods of other types of high utility patterns, including high average-utility patterns and high utility sequential patterns.

(2) Research and implement the one-phase algorithm TKCU-Miner for mining Top-K closed high utility patterns. TKCU-Miner uses an improved uList structure to prune the search space by computing the actual utility and the remaining utility of patterns, generates closed high utility patterns with a "verify prefix items, add suffix items" method, and updates the contents of the Top-K buffer in the result set in real time while updating the minimum utility value at the same time. Finally, the performance of the algorithm is verified by comparison with other similar algorithms.

(3) Research and implement an association rule mining method based on Top-K closed high utility patterns. A utility matrix stores the data needed to calculate utility confidence, and a list index method generates Top-K closed high utility association rules while avoiding duplicate rules. Finally, the distribution of the rules is analyzed on different data sets.

(4) Design and implement a verification platform for Top-K closed high utility pattern mining that integrates the proposed algorithms. The platform consists of a preprocessing module, a high utility mining module, an association rule module and a prediction module, and it uses data on user visiting behavior. The preprocessing module converts the data format, the high utility mining module analyzes user visiting time, the association rule module generates the rules, and the prediction module uses these rules to predict the places users will visit.
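The definitions used in the abstract (utility, closedness, and a Top-K buffer whose smallest entry acts as a rising minimum utility value) can be illustrated with a small brute-force sketch. This is not TKCU-Miner: it enumerates every itemset instead of pruning with a uList structure, and the toy transactions, unit profits and function names below are assumptions made only for illustration.

```python
import heapq
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 2},
]
# Hypothetical unit profit (external utility) of each item.
UNIT_PROFIT = {"a": 5, "b": 2, "c": 1, "d": 4}


def support_and_utility(pattern, transactions, profit):
    """Return (support, utility) of a pattern over the whole database."""
    support, utility = 0, 0
    for t in transactions:
        if all(item in t for item in pattern):
            support += 1
            utility += sum(t[item] * profit[item] for item in pattern)
    return support, utility


def is_closed(pattern, transactions, profit, items):
    """A pattern is closed if no proper superset has the same support.

    Checking one-item extensions is enough, because support can only
    shrink or stay equal as items are added.
    """
    support, _ = support_and_utility(pattern, transactions, profit)
    for extra in items - set(pattern):
        ext_support, _ = support_and_utility(pattern + (extra,), transactions, profit)
        if ext_support == support:
            return False
    return True


def top_k_closed(transactions, profit, k):
    """Brute-force Top-K closed high utility patterns (illustration only)."""
    items = set().union(*transactions)
    buffer = []  # min-heap of (utility, pattern); buffer[0] is the current threshold
    for size in range(1, len(items) + 1):
        for pattern in combinations(sorted(items), size):
            support, utility = support_and_utility(pattern, transactions, profit)
            if support == 0 or not is_closed(pattern, transactions, profit, items):
                continue
            if len(buffer) < k:
                heapq.heappush(buffer, (utility, pattern))
            elif utility > buffer[0][0]:
                # Replacing the weakest entry raises the minimum utility value.
                heapq.heapreplace(buffer, (utility, pattern))
    return sorted(buffer, reverse=True)


if __name__ == "__main__":
    for utility, pattern in top_k_closed(TRANSACTIONS, UNIT_PROFIT, k=3):
        print(pattern, utility)
```

On this toy data the three closed patterns with the largest utilities are {a, c} (25), {b, c, d} (19) and {a, b, c} (18), so the implied minimum utility value ends at 18 without the user having to choose it in advance.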
Keywords: data mining, pattern mining, closed high utility patterns, Top-K high utility patterns, association rules