
Research And Application Of Concise High Utility Patterns Mining Algorithms Over Data Streams

Posted on: 2022-12-29 | Degree: Master | Type: Thesis
Country: China | Candidate: H D Cheng | Full Text: PDF
GTID: 2518306752983909 | Subject: Computer technology
Abstract/Summary:
High utility pattern mining has been widely used in data stream environments such as retail markets, wireless sensor networks, and stock market prediction. Mining high utility patterns in real-time data streams is challenging because the data are unbounded, continuous, and arrive at high velocity. To address the low efficiency of Top-K high utility pattern mining algorithms over data streams and the information loss in their results, this dissertation studies concise high utility pattern mining algorithms based on the sliding window model. The main contents are as follows:

(1) An efficient algorithm, ETKDS, is proposed for Top-K high utility pattern mining over data streams, based on a newly devised concise window data list (CWData List). ETKDS first stores the position offsets of items in the CWData List and then uses the stored information to construct depth projections that extend patterns. It also introduces a reorganization technique for projected transactions in common batches, which keeps the items of window transactions sorted in the optimal order to ensure the correctness and efficiency of the projection process. In addition, to obtain a higher initial utility threshold and shorten the mining time, a hash table of co-occurrence utilities in decreasing order is designed, and a new threshold-raising strategy is proposed that uses the co-occurrence utilities stored in this structure. ETKDS further uses a transaction merging mechanism to reduce the cost of dataset scans. Experimental results indicate that ETKDS has clear advantages in running time and memory usage over other related algorithms.

(2) A closed high-utility information list (CH-List) suitable for the sliding window model is proposed. Built on the CH-List, CHUP-DS is proposed as the first algorithm for mining closed high utility patterns over data streams under the sliding window model; it constructs longer patterns by intersecting the CH-List tuples of different items to search for target patterns. Using the proposed batch-based remaining utility table, a reorganization technique for transaction tuples in common batches is designed so that the pruning strategy works reliably in each window. CHUP-DS additionally applies a remaining-utility-based pruning strategy during closure computation. Experimental results indicate that its overall performance exceeds that of existing related static and incremental algorithms.

(3) A sudden Twitter topic prediction platform based on the CHUP-DS algorithm is designed and implemented. It mainly consists of a dataset upload module, a sudden topic prediction module, a disaster event detection module, and an analysis module. Users can learn about daily sudden hot topics and the occurrence of disaster events, which helps them quickly understand public opinion and make decisions.
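The abstract does not give enough detail to reproduce ETKDS or CHUP-DS, so the following is only a minimal brute-force sketch of the underlying problem: finding the Top-K high utility patterns in each sliding window of a batched stream. It is not the thesis's method and uses none of its structures (CWData List, CH-List) or pruning strategies. The function names, the representation of a transaction as an item-to-utility dictionary, and the sample data are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations

# Assumption: a transaction maps item -> utility of that item in the
# transaction (e.g. quantity * unit profit). A batch is a list of
# transactions, and a window holds the most recent `window_size` batches.


def pattern_utility(pattern, window):
    """Sum the utility of `pattern` over every window transaction containing all its items."""
    total = 0
    for batch in window:
        for tx in batch:
            if all(item in tx for item in pattern):
                total += sum(tx[item] for item in pattern)
    return total


def top_k_patterns(window, k):
    """Enumerate every itemset in the window (exponential cost) and keep the k best."""
    items = sorted({item for batch in window for tx in batch for item in tx})
    scored = []
    for size in range(1, len(items) + 1):
        for pattern in combinations(items, size):
            utility = pattern_utility(pattern, window)
            if utility > 0:  # skip itemsets that never occur together
                scored.append((utility, pattern))
    # Highest utility first; ties broken by the pattern itself for determinism.
    scored.sort(key=lambda entry: (-entry[0], entry[1]))
    return scored[:k]


def mine_stream(batches, window_size, k):
    """Slide a window of `window_size` batches over the stream, mining each full window."""
    window = deque(maxlen=window_size)  # the oldest batch is dropped automatically
    results = []
    for batch in batches:
        window.append(batch)
        if len(window) == window_size:
            results.append(top_k_patterns(window, k))
    return results


# Hypothetical stream of three batches, mined with a window of two batches.
batches = [
    [{"a": 5, "b": 2}, {"b": 4, "c": 3}],
    [{"a": 1, "c": 6}],
    [{"a": 2, "b": 3, "c": 1}],
]
results = mine_stream(batches, window_size=2, k=2)
for window_result in results:
    print(window_result)
```

Because this sketch re-enumerates every itemset from scratch for each window, it is only practical for toy inputs. Avoiding that cost is what algorithms such as those summarized above aim for, through compact per-window structures, threshold raising, and pruning.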
Keywords/Search Tags: Data streams, high utility patterns, Top-K high utility patterns, closed high utility patterns, sliding window