Font Size: a A A

Research Of Bayesian Logic Program Model Realization

Posted on:2008-01-24Degree:MasterType:Thesis
Country:ChinaCandidate:L N LiFull Text:PDF
GTID:2178360212996237Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
In recent years, there has been an increasing interest in first order probabilistic models. it uses first order logic as knowledge representation language, and applys probability to model uncertainty , it is an effective framework for figuring real world problems. Bayesian logic programs combines Bayesian networks with definite clause logic, is a simplification of similar models, it can express domain problem more compactly. as first order extensions of Bayesian networks, BLP conquers proposition limitation of Bayesian networks, simultaneously possess ability to express uncertainty. Bayesian logic programs has both logic programs as well as Bayesian networks as an immediate special case, so it is easy to understand and apply. Bayesian logic programs can also handle domains involving structured terms as well as continuous random variables. for domains with many objects and complicated relationships between them, BLP is a suitable model.Presently, most realizations of BLP base on first order rules mining algorithm in inductive logic program. consider the hugeness and the extreme complexity of first-order rules space, in order to carry out an effective search, most ILP systems have used greedy search strategy and required to explicitly present strong language bias related to the mining task, to reduce the search range or is used as heuristics to guide the search. The greedy search strategy may let search trap in a local maximum, on the other hand, we need more powerful method to handle with mining problems with large search space.With such backgrounds, this paper launches the research work and presented one kind of realization of BLP model, which is named LBLP. LBLP consists of two components: firstly a logical one, a set of Bayesian clauses, and secondly a quantitative one, a set of conditional probability distributions and combining rules corresponding to that logical structure. During the process of rule mining, this paper adopt particle swarm optimization (pso)algorithm as the search strategy. When studying for the notion of a target concept, LBLP accomplish following steps: First, pretreatment of dataset. this paper designs a small compiler, it handles datasets and deposits them into a domain. LBLP defines on the datalog set, that is only constants or variables appear in a predicate. so program gives defines of variable type,predicate,constant,mapping between type and constant, then parsers following syntax rules, valid datasets are stored into a domain.Second, decode. confirm the maximum of sequence number of different variables appearing in explored rules. consequently set of predicate templates are determined, they form the whole problem research space. predicate template is kernel of clauses ,several templates form one clause. in pso algorithm, one particle denotes body of a clause, particle swarm denote clause space, the length of a clause is dimension of particle.Third, discovery process includes two algorithms: the inner algorithm, which is the clause discovery algorithm, has for its task to find and return the clause, which better classifies the predominant class in a given instance set, it is here that the pso algorithms are used. the outer is the covering algorithm, it is basically a divide-and-conquer technique. given a instance training set, it runs the clause discovery algorithm in order to obtain the highest quality clause for the predominant class in the training set. correctly classified instances are then removed from the training set and the clause discovery algorithm is run once more. Iteratively a sequential rule set is built, and the covering algorithm runs until only a pre-defined number of instances are left to classify.Fourthly, calculate fitness. this paper presents three functions: one uses the covering number of target positive examples as the value of fitness, second uses the ratio of positive examples and total examples, third is a function of positive examples,negative examples and length of clause. Fifthly, post-treatment of clause sets, delete repeated and irrespective predicates from clause.During the process of getting qualitative description, assign associated conditional probability densities to each clause, and assign corresponding combining rules to predicate. The method to calculate conditional probability densities are:First, in statistics, if the number of samples is enough, probability can be replaced by frequency of occurrence, this paper apply this method to calculate conditional probability densities when data are integrated.Second, realize a first-order extension of gradient ascent method, and get conditional probability densities when data are un-integrated. This method uses a function over the observed data and probability parameter, it defines occurrence probability of dataset at a certain parameter point, aims at finding parameter which maximize probability value.Third, for clauses which have equal head, this paper sets combining rule of noise-or on it. thus, Logic structure and qualitative component are formulated, LBLP realizes a BLP model.This paper realizes the query-answering procedure, it adapts two phase strategy of knowledge-based model construction method: for query which has no evident, construct the support network of query variable, then calculate prior probability; for query with evident, construct the union network of query variable and evident variables, then apply an inference algorithm on the union network to calculate probability.This paper carries out experiments on one available databases: the UW-CSE database used by Richardson and Domingos. and tests the performance of pso algorithm,gradient ascent method and so on, gives experiments result of studying process. this paper use tenfold cross-validation method to compare pso algorithm with genetic-based rule mining method, tests indicate that pso algorithm can obtain clause set effectively, and has predominance on covering speed.
Keywords/Search Tags:Realization
PDF Full Text Request
Related items