Research On The Mining Method Of Proximity Sequence Pattern

Posted on: 2013-08-13  Degree: Master  Type: Thesis
Country: China  Candidate: W J Liu  Full Text: PDF
GTID: 2248330395952407  Subject: Computer application technology
Abstract/Summary:
Faced with the rapid growth of data, people are under the heavy pressure of the "information explosion". At the same time, they are caught in the dilemma of being "rich in data but poor in knowledge". The emergence and development of data mining provides a powerful way out of this difficulty. In essence, data mining lets data reveal its own value: it explores large volumes of enterprise data according to established business objectives, uncovers hidden regularities, and models them with advanced, effective methods.
Mining graph patterns in large information networks is critical to a variety of applications such as malware detection and biological module discovery. However, frequent subgraphs are often ineffective at capturing the associations that exist in these applications, because of the complexity of isomorphism testing and the inelastic pattern definition. Graph patterns therefore cannot capture all the rules found in real life. Arijit and Yan were the first to introduce the proximity pattern, which is a significant departure from the traditional concept of frequent subgraphs. Defined as a set of labels that co-occur in neighborhoods, a proximity pattern blurs the boundary between itemset and structure. It relaxes the rigid structure constraint of frequent subgraphs while introducing connectivity to frequent itemsets. It can therefore benefit from both: efficient mining from itemsets and structural proximity from graphs. Many real-world problems call for proximity patterns, because frequent subgraph mining and itemset mining alone cannot handle them.
However, the proximity pattern has been applied only to undirected graphs. With the advent of Twitter and SNS, a great deal of information is naturally modeled by directed graph structures, and techniques designed for undirected graphs cannot meet this need. This thesis therefore proposes a new kind of association rule, the proximity sequence pattern, which applies the proximity pattern to directed graphs. Built on the proximity pattern, the proximity sequence pattern modifies the traditional sequence pattern and broadens its scope. It takes into account not only the order of the transactions but also the order within each transaction. The proximity sequence pattern fills the gap left by the proximity pattern in directed graph applications, widens its range of application, and helps in mining certain kinds of network information and social phenomena.
To mine proximity sequence patterns, this thesis puts forward an information transmission model that converts a complex digraph information database into a simple probabilistic sequence transaction set. Depending on how the association is calculated, information transmission can be divided into two models: Nearest Probabilistic Association and Normalized Probabilistic Association. After the conversion to a probabilistic sequence transaction set, an improved FP-Tree structure, the DFP-Tree, is adopted to store the data set, and an efficient algorithm, DFP-growth, is then used to mine the proximity sequence patterns.
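
The conversion step can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration rather than the thesis's actual model: the function name `nearest_association`, the exponential decay `exp(-alpha * d)`, the hop cutoff `max_depth`, and the choice to order labels by hop distance along edge direction are all hypothetical. The sketch only shows the general idea of turning one vertex of a labeled digraph into an ordered, probabilistic transaction based on the nearest vertex carrying each label.

```python
import math
from collections import deque


def nearest_association(graph, labels, source, alpha=1.0, max_depth=2):
    """Build one probabilistic sequence transaction for `source`.

    graph:  dict vertex -> list of successors (directed edges)
    labels: dict vertex -> list of labels on that vertex
    Returns [(label, probability), ...] ordered by hop distance.
    The decay exp(-alpha * d) is an assumed form, not taken from the thesis.
    """
    dist = {source: 0}
    queue = deque([source])
    nearest = {}  # label -> hop distance of the nearest vertex carrying it

    while queue:
        u = queue.popleft()
        # BFS visits vertices in order of distance, so the first time a
        # label is seen it is at its minimal distance from `source`.
        for label in labels.get(u, ()):
            nearest.setdefault(label, dist[u])
        if dist[u] == max_depth:
            continue  # do not expand beyond the cutoff
        for v in graph.get(u, ()):  # follow outgoing edges only
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    ordered = sorted(nearest.items(), key=lambda kv: (kv[1], kv[0]))
    return [(label, math.exp(-alpha * d)) for label, d in ordered]


# Toy digraph a -> b -> c -> d with hypothetical labels.
graph = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
labels = {"a": ["x"], "b": ["y"], "c": ["x", "z"], "d": ["w"]}

transactions = {v: nearest_association(graph, labels, v) for v in graph}
```

Running the function once per vertex yields a set of probabilistic transactions that an FP-Tree-style miner could then consume. Because only outgoing edges are followed, the transaction for `d` contains just its own label, which is the kind of asymmetry an undirected model would not capture.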
Keywords/Search Tags: Association rules, Sequence, Proximity Pattern, Mining