
Research On Sequence Fuzzy Concept Lattice Model And Distributed Treatment

Posted on: 2010-09-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Yuan
GTID: 2178360275496304
Subject: Computer application technology
Abstract/Summary:
With the development of the economy, science, and technology, information technology has found extensive application. Many fields have accumulated abundant data, so people need new techniques and tools to help them discover the important and valuable information in it. Data mining was proposed to solve this problem. As an important branch of data mining, sequence pattern mining has been widely studied. Scholars at home and abroad have proposed various algorithms for detecting sequence patterns, for example AprioriAll, SPADE, and PrefixSpan. Sequence pattern mining has been widely applied to customer purchase behavior analysis, network access analysis, DNA sequence pattern analysis, and other fields.

However, current sequence pattern mining algorithms can only discover the frequent sequences that satisfy the minimum support threshold minsup. They do not consider the importance of sequences and items, so most of the frequent sequences they detect may be of no interest to users. Some sequences do not satisfy minsup yet are very valuable to users. Conversely, some sequence patterns matter so little to users that they need not be detected. It is therefore essential for sequence pattern mining algorithms to adapt themselves so that they detect the sequence patterns that meet users' requirements. However, existing mining algorithms do not take this into account and so cannot mine the frequent sequences that meet users' needs.

A concept lattice in formal concept analysis can be constructed with only one scan of the database. It describes knowledge and the hierarchical relations between concepts well, and it only needs to store the maximal common subsequences.
As a result, a concept lattice reduces the generation of redundant sequences and saves much space. Therefore, this paper systematically studies the combination of sequence pattern mining and fuzzy concept lattices. The main research results are as follows:

(1) Because current algorithms for building concept lattices still take a long time to generate concepts on large-scale sparse data sets or distributed data sets, this paper presents a new algorithm, IETreeCS (Concept Sets based on Intension and Extension Tree), based on the IE-Tree (Intension and Extension Tree) and characteristic space partitioning. The IETreeCS algorithm first defines the IE-Tree and converts the formal context into an IE-Tree, which reduces the storage the formal context requires. It then defines the characteristic space and describes how to partition a characteristic space on the basis of the IE-Tree. Finally, the paper presents the complete IETreeCS algorithm, which generates all concepts from a binary relation. The experimental results show that IETreeCS is more efficient than the NextClosure and SSPCG algorithms on large-scale sparse data sets or distributed data sets. IETreeCS also provides algorithmic support for constructing the sequence fuzzy concept lattice.

(2) To organize and mine valuable sequence patterns that satisfy users' various requirements, this paper presents a new sequence fuzzy concept lattice model. We extend the traditional fuzzy formal context so that it can express sequences concisely, and define the sequence fuzzy formal context. On this basis, the paper defines the Galois connection, the sequence fuzzy concept, and the sequence fuzzy concept lattice.
Finally, the paper presents SeqFuzCL (Sequence Fuzzy Concept Lattice), an incremental construction algorithm for the sequence fuzzy concept lattice. The experimental results show that SeqFuzCL can effectively express self-adaptive sequence patterns in the lattice and has excellent performance in terms of time and space complexity. The model also provides theoretical support for mining self-adaptive sequence patterns.

(3) In practice, many large databases are stored in distributed systems, so sequences also exist in distributed settings. To handle distributed sequences efficiently and conveniently, this paper puts forward a distributed sequence fuzzy concept lattice model on the basis of the sequence fuzzy concept lattice, and presents a construction algorithm, DseqFuzCL (Distributed Sequence Fuzzy Concept Lattice). In the distributed sequence fuzzy concept lattice, DseqFuzCL can not only discover distributed sequential patterns but also mine specific distributed sequential patterns that satisfy users' various demands, for instance weighted sequence patterns. The experimental results show that DseqFuzCL has excellent performance in terms of time and space complexity.

(4) On the basis of the sequence fuzzy concept lattice, this paper defines the self-adaptive coefficient and the self-adaptive sequence pattern (SASP), and then presents a new algorithm, SASeqP (Self-adaptive Sequence Pattern), for mining self-adaptive sequence patterns based on the sequence fuzzy concept lattice. SASeqP can dynamically adjust the minimum support threshold minsup through the self-adaptive coefficient to discover sequential patterns that are very valuable to users.
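
For readers unfamiliar with formal concept analysis, the following is a minimal sketch of what "generating all concepts from a binary relation" means. It is not IETreeCS, NextClosure, or any other algorithm from the thesis: it simply closes every attribute subset, which is exponential and suitable only for tiny examples. The toy context and the names CONTEXT, extent, intent, and all_concepts are assumptions made up for illustration.

```python
from itertools import combinations

# Hypothetical toy formal context (binary relation): object -> its attributes.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"b", "c"},
    "o3": {"a", "b", "c"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def extent(attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, has in CONTEXT.items() if attrs <= has)


def intent(objs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set(ATTRIBUTES)
    for o in objs:
        shared &= CONTEXT[o]
    return frozenset(shared)


def all_concepts():
    """Naive enumeration: close every attribute subset (illustration only)."""
    concepts = set()
    attrs = sorted(ATTRIBUTES)
    for r in range(len(attrs) + 1):
        for subset in combinations(attrs, r):
            ext = extent(frozenset(subset))
            # (ext, intent(ext)) is a formal concept: each part determines the other.
            concepts.add((ext, intent(ext)))
    return concepts


if __name__ == "__main__":
    for ext, itt in sorted(all_concepts(), key=lambda c: -len(c[0])):
        print(sorted(ext), sorted(itt))
```

The pair of functions extent and intent forms the Galois connection of the ordinary (non-fuzzy, non-sequence) setting. The sequence fuzzy variants defined in the thesis are not reproduced here.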
Keywords/Search Tags: FORMAL CONCEPT ANALYSIS, CONCEPT LATTICE, FUZZY FORMAL CONTEXT, FUZZY CONCEPT LATTICE, DATA MINING, SEQUENCE PATTERN MINING, DISTRIBUTED SEQUENCE PATTERN