Study And Implementation Of Spark-based Algorithms For Constructing Concept Lattices

Posted on: 2018-10-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
GTID: 2348330521451515
Subject: Engineering
Abstract/Summary:
Formal concept analysis (FCA), also known as concept lattice theory, was proposed by R. Wille in 1982. It expresses concepts and their hierarchy in mathematical form and has been widely used in fields such as knowledge engineering, information retrieval, and software engineering. Qi et al. combined three-way decision theory with concept lattice theory and proposed three-way concept analysis. On the one hand, three-way concept analysis extends FCA: three-way concepts can express both the "jointly possessed" and the "jointly not possessed" semantics in a formal context, and so carry more detailed information than classical concepts. On the other hand, three-way concept analysis provides a more concrete model for three-way decision theory: according to the definition of three-way concepts, the attribute domain or object domain is divided into three parts, on which three-way decisions are then made.

As a means of data analysis, the first task of FCA is to construct the concept lattice corresponding to a formal context. In general, the number of concepts in a concept lattice or a three-way concept lattice grows exponentially. The efficiency of the construction algorithm therefore determines whether concept lattice theory and three-way concept theory can be applied successfully in practice. Traditional non-distributed construction algorithms can generally handle only small data sets and cannot meet the needs of rapidly developing big-data applications. Focusing on classical and three-way concept lattices, this thesis designs and implements distributed construction algorithms on the Spark platform.

First, the thesis reviews the definitions of formal concept analysis and three-way concept analysis, along with several serial and distributed algorithms for constructing classical concept lattices. A distributed algorithm for constructing classical concept lattices is then designed from the basic idea of the Cbo algorithm and implemented on Spark. Cbo uses a depth-first search strategy, but in cluster computing a recursive task is hard to split into subtasks that can be distributed to the individual nodes, so the search strategy of Cbo is not suited to distributed algorithms. To solve this problem, the thesis reworks Cbo, transforming its recursion into iteration with a breadth-first search strategy that fits the distributed computing framework. Based on the basic idea of Cbo and the properties of three-way concept lattices, an iterative distributed algorithm for constructing three-way concept lattices is also proposed. A series of RDD transformations and actions provided by Spark generate all the concepts and carry out pruning during the computation, which yields a Spark-based algorithm for constructing three-way concept lattices.

Finally, the algorithms are analyzed experimentally. For classical concept lattices, the Spark-based algorithm shows some efficiency improvement over algorithms based on the Hadoop platform. For three-way concept lattices, experiments on k-uniform formal contexts show that the Spark-based distributed algorithm is more efficient than the serial algorithm.
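The step from recursive to iterative Cbo described above can be illustrated with a minimal single-machine sketch. It is not the thesis's implementation: it uses plain Python lists rather than Spark RDDs, covers only classical (not three-way) concepts, and every name in it (`intent_of`, `expand`, `concepts_bfs`, the context layout) is hypothetical. The idea it shows is that each search node carries everything needed to produce its children, so one whole level of the search tree can be expanded independently of the others; on a cluster, that per-level expansion is the part that could be spread across nodes.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

# A search node: (extent, intent, index of the first attribute still to try).
Node = Tuple[FrozenSet[str], FrozenSet[int], int]
Context = Dict[str, Set[int]]  # assumed layout: object name -> set of attribute indices


def intent_of(objs: Iterable[str], context: Context, n_attrs: int) -> FrozenSet[int]:
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set(range(n_attrs))
    for g in objs:
        shared &= context[g]
    return frozenset(shared)


def expand(node: Node, context: Context, n_attrs: int) -> List[Node]:
    """Generate the canonical children of one node (one Cbo step, no recursion)."""
    extent, intent, start = node
    children = []
    for j in range(start, n_attrs):
        if j in intent:
            continue
        new_extent = frozenset(g for g in extent if j in context[g])
        new_intent = intent_of(new_extent, context, n_attrs)
        # Canonicity test (the pruning step): the closure must not add any
        # attribute with index below j, otherwise this concept is reached
        # from another branch and is skipped here.
        if all((i in new_intent) == (i in intent) for i in range(j)):
            children.append((new_extent, new_intent, j + 1))
    return children


def concepts_bfs(context: Context, n_attrs: int) -> List[Tuple[FrozenSet[str], FrozenSet[int]]]:
    """Enumerate all formal concepts level by level instead of recursively."""
    top = frozenset(context)
    frontier: List[Node] = [(top, intent_of(top, context, n_attrs), 0)]
    found = []
    while frontier:
        found.extend((extent, intent) for extent, intent, _ in frontier)
        # Each node is expanded independently of the others in its level.
        frontier = [child for node in frontier
                    for child in expand(node, context, n_attrs)]
    return found
```

Calling `concepts_bfs({"g1": {0, 1}, "g2": {1, 2}, "g3": {0, 2}}, 3)` returns each concept of that small context once as an `(extent, intent)` pair. The loop body is a flat map over the current frontier, which is why the breadth-first form lends itself to a data-parallel framework more readily than the recursive depth-first form.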
Keywords/Search Tags:formal concept analysis, three-way concept analysis, Spark, Map-Reduce