Research On Methods Of Learning Statistical Relational Model

Posted on: 2009-05-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Yu
GTID: 1118360245963173
Subject: Computer software and theory
Abstract/Summary:
Enabling computer programs to learn useful information automatically has long been an important topic in artificial intelligence. Real-world data usually exhibit complicated correlations, yet most current learning algorithms assume that the data are independent and identically distributed (i.i.d.). Such algorithms therefore suit only simple objects and "flat" data. In practice, data are often unstructured or semi-structured as well as structured. Statistical Relational Learning (SRL) provides a powerful way to learn useful information from such multi-relational data.

SRL integrates probabilistic reasoning, logical representation and machine learning (or data mining). It is suited to extracting hidden knowledge from complicated multi-relational data and is important in many application areas. Since 2000, statistical relational model learning has attracted the attention of many research institutions in China and abroad, and major international conferences (such as AAAI-2000, IJCAI-2003 and ICML-2004) have treated SRL as an important topic. It has thus become an increasingly active research area in data mining and machine learning.

Current SRL methods still have shortcomings, such as a tendency to fall into local optima. Nondeterministic methods, such as genetic algorithms (GA) and immune algorithms, are suitable for addressing these problems. In addition, the relational structure of statistical relational models is based on first-order logic, and learning first-order logic structure usually involves a very large search space, so reducing the search space is the key to improving the efficiency and precision of structure learning. Moreover, research on learning statistical relational models can broaden the research area of SRL and promote its application.

Drawing on statistical relational learning, inductive logic programming (ILP), GA, immune algorithms and particle swarm optimization (PSO), this thesis studies how to learn statistical relational models from relational data. By analysing and comparing current methods, it identifies their deficiencies, raises new questions and proposes new, more efficient methods. The research proceeds as follows. Based on an analysis of the characteristics of relational data and of current SRL research methods, the thesis draws on and extends the classical methods and focuses on four problems: reducing the search space in statistical relational model learning; extending the expressive power of the description language when background knowledge is insufficient; learning statistical relational models from incomplete relational data; and a unified learning method applicable to the various kinds of statistical relational models. For these problems it proposes learning algorithms based on templates and nondeterministic methods. Experimental results are provided to show the effect of these algorithms.

The main results and contributions of the thesis are as follows.

(1) The thesis summarizes the research content and status of SRL, focusing on the concepts, characteristics and research status of the various kinds of statistical relational models, and on the advantages and characteristics of SRL in dealing with complicated relational problems. We analyse SRL by dividing it into three parts: machine learning, probabilistic reasoning and logical representation. We classify SRL methods from the viewpoint of model-extension-based methods and relational classification algorithms, and briefly describe each method. Finally, we summarize the development of SRL and the problems that remain for future work. This discussion of the basic concepts forms the theoretical basis of our further research.

(2) The thesis analyses and summarizes the characteristics of relational data. We divide the data into complete and incomplete data and compare relational data with traditional attribute-value data so that the characteristics of relational data stand out. We list the main characteristics of relational data: concentrated linkage, relational autocorrelation and degree disparity. We also describe an indirect-influence characteristic that was found in model learning experiments. After analysing these characteristics, we summarize the advantages and disadvantages that each brings to model learning. For the disadvantages, we summarize the current remedies as a reference for researchers.

(3) We analyse and summarize several current structure learning algorithms for statistical relational models. To address the large search space encountered in structure learning, we propose a template-based clause learning algorithm and define the clause template as an intermediate structure. The algorithm first learns clause templates with a genetic algorithm and then converts them into clauses by combining a tag matrix with information gain sampling. Because it searches over clause templates rather than over clauses, the algorithm reduces the search space effectively. We design the corresponding fitness function and genetic operators. Theoretical analysis and experimental comparison show that the algorithm improves learning efficiency, can learn recursive clauses, and is an effective clause learning algorithm.

(4) By applying the template-based clause learning algorithm to structure learning for Markov logic networks (MLNs), we reduce the search space and obtain compact results. After structure learning, we use PSO to learn the parameters of the MLN. Theoretical analysis and experimental comparison show that this approach can learn better results.

(5) For databases that do not contain sufficient background knowledge, we propose a clause learning algorithm that invents new predicates from a relational database by combining the template-based clause learning algorithm with an immune mechanism. It uses the fluctuation of the fitness function to judge whether the result space lacks predicates. When it does, the algorithm invents the needed predicates by combining the current best results and exploiting the way clause heads cover positive and negative examples, so that the needs of learning are met. Experiments show that the algorithm extends the descriptive power of the background knowledge.

(6) Learning statistical relational models from relational data that contain missing values is of practical significance, yet the problem has seldom been studied. Taking the MLN as the target model, we propose MEM, an algorithm that learns an MLN from relational data with missing values by extending the EM algorithm with the template-based clause learning algorithm. We define relational missing data and design algorithms for constructing the initial MLN and for completing the missing relational data. Both theoretical analysis and experimental results show that MEM can effectively learn MLNs from relational data with missing values.

(7) In view of the different model structures in current SRL, we discuss the common characteristics of these models and analyse the feasibility of applying a unified learning method to them. We propose a unified learning method that builds on the preceding work. This method serves as a reference for learning the various kinds of statistical relational models, helps in understanding the essence of SRL, and helps broaden the application of SRL. It may also inform research in other areas of machine learning.

The results of this thesis enrich the set of statistical relational model learning algorithms and are of theoretical and practical benefit to further studies of statistical relational model learning.
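
For readers unfamiliar with the Markov logic networks mentioned in the abstract, the following minimal sketch may help. It is not taken from the thesis. It only illustrates the standard log-linear form of an MLN, in which the probability of a possible world is proportional to exp(sum over formulas of weight times the number of true groundings). The toy domain (two people, the single formula Smokes(x) => Cancer(x)) and all names in the code are assumptions made for illustration.

```python
import itertools
import math

# Hypothetical toy domain, chosen only for illustration.
PEOPLE = ["anna", "bob"]
ATOMS = [(pred, person) for pred in ("Smokes", "Cancer") for person in PEOPLE]


def true_groundings(world):
    """Count the people for whom Smokes(p) => Cancer(p) holds in `world`."""
    return sum(
        1
        for person in PEOPLE
        if (not world[("Smokes", person)]) or world[("Cancer", person)]
    )


def world_probabilities(weight):
    """Return (world, probability) pairs for a one-formula MLN.

    Each world gets the unnormalized score exp(weight * n(world)), where
    n(world) is the number of true groundings of the formula. Dividing by
    the sum of all scores (the partition function) gives probabilities.
    """
    worlds = [
        dict(zip(ATOMS, values))
        for values in itertools.product([False, True], repeat=len(ATOMS))
    ]
    scores = [math.exp(weight * true_groundings(w)) for w in worlds]
    z = sum(scores)
    return [(w, s / z) for w, s in zip(worlds, scores)]


if __name__ == "__main__":
    for world, prob in world_probabilities(1.5):
        smokers = [p for p in PEOPLE if world[("Smokes", p)]]
        sick = [p for p in PEOPLE if world[("Cancer", p)]]
        print(f"smokers={smokers} cancer={sick} p={prob:.4f}")
```

With a weight of zero every world is equally likely. A larger weight shifts probability toward worlds that violate the formula less often. Enumerating all worlds is feasible only for toy domains like this one.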
Keywords/Search Tags: Statistical Relational Learning, Statistical Relational Model, Relational Database, Statistical Relational Model Learning, Parameter Learning, Structure Learning, Complete Relational Data, Incomplete Relational Data, Expectation