Font Size: a A A

Research On Uncertain Frequent Graph Data Mining

Posted on:2015-05-28Degree:MasterType:Thesis
Country:ChinaCandidate:L B HeFull Text:PDF
GTID:2298330467988815Subject:Computer technology
Abstract/Summary:PDF Full Text Request
The21st century is an information-oriented era, every minute of all walks of life in the pursuitof the pace of information, how to find valuable information from the mass of information is amajor challenge. Due to the advantages of graph data mining graph data structure favored by themajority of research enthusiasts, and has been widely used in various industries, such as thebiological field, the field of chemistry, social networking and so on. How to find interestingpatternsfromgraphdatahasbecomean importantfield ofdataminingresearch.Based on the study of graph mining application background, basic principles and researchstatus on the graph mining, analyzed some problem which is still existing in graph mining,andproposed improved algorithm for the disadvantage of mining uncertain frequent subgraph atcurrent stage.The mainresearch isas follows:(1)The classic MUSE of uncertainty for mining frequent subgraph algorithm needed big timefor calculating expected support.So its calculation efficiency is not high.So proposed a method thatusing BFS thought to find frequent subgraph patterns from edge support tree. Firstly use improvedgSpan algorithm to deal with uncertain graphs to reduce the space of subgraph patterns,thendividing for the frequent data,and making a support tree in each small data sets.Lastly using BFSalgorithm to mining frequent subgraph.The subgraph isomorphism tests and the edge whetherexisting tests indicates that EBFS is more efficiency than MUSE.It will be more efficient in largedataset.(2)As the RAKING algorithm of mining greatly uncertain frequent subgraphs exists timeefficiency and space efficiency not high enough problem,proposed UDMFS algorithm.Thisalgorithm is based on EBFS algorithm and consists of two phases:The fist phase is generatinguncertain frequent subgraph by EBFS algorithm.the second phrase is generating greatly frequentuncertain subgraphs with the rightmost rule extending a new edge,a new greatly frequent candidatesubgraph is generated.If its expected support is bigger than the min_sup and it cann’t extend a newedge any more, it will be a member of greatly frequent uncertain subgraphs.Finally tests indicatesUDMFS algorithm is more efficiency than traditional RAKING algorithm on mining uncertaingreatlyfrequentsubgraph.
Keywords/Search Tags:Uncertain data mining, Expect support, Division, Edge support tree, Greatlyfrequent patterns, Therightmost rule, Greatly frequent uncertain subgraphs
PDF Full Text Request
Related items