Study Of Sequential Mining In Database

Posted on:2002-04-17

Degree:Master

Type:Thesis

Country:China

Candidate:L Chen

Full Text:PDF

GTID:2168360032457008

Subject:Computer applications

Abstract/Summary:

Faced with a soaring amount of information, people are overwhelmed by the "data bomb" while at the same time fearing a "shortage of knowledge". Knowledge discovery in databases (KDD) arose to meet this need and has become one of the strongest tools for resolving this paradox. Data mining is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. Algorithms are the key part of KDD, which is why efficient algorithms are worth researching. On the one hand, data mining is used to process large databases, so the efficiency of the algorithm matters most; on the other hand, the computers currently in use are not adequate for processing large databases. Consequently, existing algorithms should be modified to meet these needs. For these reasons, this paper takes data mining algorithms as its research subject.

This paper studies sequence mining algorithms in depth. Mohammed J. Zaki proposed the SPADE algorithm. Compared with Apriori, SPADE is less time-consuming: because of its vertical storage format, the database is scanned far fewer times, so it is more efficient. Despite these advantages, SPADE has the following disadvantages: 1. It produces a large number of candidate sets during processing; 2. The frequent sequences that SPADE produces are limited to special items.

To make up for these deficiencies, this paper proposes the Middle Matching algorithm, whose main idea is to generate candidate sets by matching two sequences in the middle, which reduces the number of candidate sets. Processing in this way also avoids the second disadvantage, widening the domain in which the algorithm can be used. This paper compares the candidate sets produced by the two algorithms. The experiment shows that the Middle Matching algorithm is more efficient than SPADE. The algorithm studied in this paper can be used for sequence mining or for finding frequent sequences related to time. In the last part of this paper, an example of time-series-based stock mining is used to assess the Middle Matching algorithm.
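
To make the vertical storage idea concrete, the following is a minimal Python sketch of how a vertical (id-list) layout lets support be counted by joining lists instead of rescanning the database. It is an illustration under stated assumptions, not the thesis's implementation and not the Middle Matching algorithm: the toy database and the function names `to_vertical`, `support`, and `temporal_join` are hypothetical, and only single-item events are handled.

```python
from collections import defaultdict

def to_vertical(db):
    """Convert a horizontal database {sid: [(eid, item), ...]}
    into per-item id-lists of (sid, eid) pairs (hypothetical helper)."""
    idlists = defaultdict(list)
    for sid, events in db.items():
        for eid, item in events:
            idlists[item].append((sid, eid))
    return idlists

def support(idlist):
    """Support = number of distinct sequences that appear in the id-list."""
    return len({sid for sid, _ in idlist})

def temporal_join(a, b):
    """Id-list for the pattern 'a followed later by b'.
    Keeps each (sid, eid) of b that has an earlier a-event in the same sequence."""
    first_a = {}
    for sid, eid in a:
        first_a[sid] = min(eid, first_a.get(sid, eid))
    return [(sid, eid) for sid, eid in b
            if sid in first_a and first_a[sid] < eid]

# Toy data (assumed for illustration): sequence id -> list of (event time, item)
db = {
    1: [(1, "A"), (2, "B"), (3, "C")],
    2: [(1, "B"), (2, "A")],
    3: [(1, "A"), (2, "C"), (3, "B")],
}

idlists = to_vertical(db)                         # one pass over the database
ab = temporal_join(idlists["A"], idlists["B"])    # no further database scan
print(support(idlists["A"]), support(ab))
```

After the single conversion pass, the support of a longer candidate is obtained from the id-lists of its parts, which is why a vertical layout needs far fewer database scans than a horizontal, Apriori-style count.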

Keywords/Search Tags:

Data Mining, Support, Middle Matching Algorithm, Candidate Set, SPADE

Related items

1	Research And Application Of Character Sequence Pattern Mining Algorithm
2	Association Rules Candidates To Support The Study Of The Frequency
3	Data Mining Technique Application Study On Logistics System
4	Realization Of Data Mining On Machinery Enterprise Marketing System
5	Design And Implementation Of The Phone Virus System Based On Sequential Patterns Mining
6	Data Mining Association Rules In The Research And Application
7	Research On Multi-candidate Orthogonal Matching Pursuit Improved Algorithm Based On Compressed Sensing
8	The Applied Research On Data Mining In The Library Management System Of The Middle Schools
9	Design And Implementation Of The Sequential Pattern Mining Algorithm GSP
10	Study On Association Rules Algorithm And Application For Data Mining