
Learning Bounds And Applications Of Relational Classification Model

Posted on: 2016-09-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 1108330503969583
Subject: Computer Science and Technology
Abstract/Summary:
The rapid development of the information society has produced a flood of information. This information has complex intrinsic interrelationships, as in social networks and the World Wide Web: it not only has a complex structure but is also interlinked through hyperlinks and references. Machine learning methods can extract valuable patterns and knowledge from such massive data to help solve various social and everyday problems. However, the most common statistical machine learning methods often overlook the relationships among the data, which carry deeper semantic information. Model learning is based on the assumption that the data are independent and identically distributed, which leads to a poor fit between the learned model and the data.

To deal with these complex situations, researchers have proposed Statistical Relational Learning (SRL). SRL, also known as probabilistic logic learning, aims to obtain likelihood models, patterns, specific information, and knowledge from relational data, and then to use them for reasoning, prediction, and classification. SRL combines knowledge bases with probabilistic models, which allows it to handle complex problems. SRL has become an important research field and has been applied successfully in bioinformatics, systems biology, natural language processing, Web data mining, and social network analysis.

This dissertation studies statistical relational models for classification. A statistical relational learning model used for classification is called a Relational Classification model (RC model). The learning process of an RC model is affected by the relationships among the samples. Some empirical studies have shown that when the links in relational data have high autocorrelation, an RC model classifies better than a traditional classification model. However, the learning bounds of RC models have received relatively little study. Researchers control the learning process of RC models only by experience, which can result in poor generalization performance. It is therefore necessary to study this issue in depth, to understand the learning process of relational classification more thoroughly, and to optimize that process. In addition, network security problems such as network situational awareness and Internet opinion analysis involve highly correlated data, so it is well worth applying relational classification models to these challenges.

The main innovations of this dissertation are:

1. To address the lack of an accurate complexity measure and general learning bounds for RC models, we propose a new complexity measure, the relational dimension, which measures the linking ability of a relational classification model. The relation between this complexity and the growth function is proved, and a learning bound for finite Vapnik–Chervonenkis (VC) dimension and relational dimension is obtained. We then analyze the conditions under which the model is learnable and the bound is non-trivial, as well as the feasibility of the bound. Finally, we analyze the learning process of relational classification models based on Markov logic networks (MLNs) and give some examples. An experiment on a real dataset demonstrates that the bound is useful in some practical problems.

2. To address the lack of a stability measure and a stable learning algorithm for RC models, we derive a learning bound in terms of a new measure, dependence stability, and a limited VC dimension. Based on this learning bound, we design a stable learning algorithm. Applying a Markov logic network to synthesized and real-world datasets, our experimental results demonstrate that the bound can be tight if the RC model has appropriate dependence stability and limited VC dimension. Our learning algorithm increases the stability of RC models while reducing the deviation between empirical risk and true risk.

3. For multi-domain relational transfer learning, we propose an algorithm that hybridizes and creates new knowledge, which is formalized as an uncertain hypergraph. We then propose a method for mining frequent sub-hypergraphs from the uncertain hypergraph (MFS-UHG). The frequent sub-hypergraphs are pivot knowledge, which has to be transferred with high priority. We embed the pivot knowledge in the process of MLN structure learning. Experimental evaluation on four domain datasets shows that the algorithm outperforms state-of-the-art MLN-based transfer learning.

4. The above theory and algorithms are applied to network availability estimation, network opinion leader recognition, and related tasks. This verifies the effectiveness of our learning theory and algorithms on network security problems.
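
The abstract notes that RC models tend to help when linked samples show high autocorrelation. As a rough illustration only, the sketch below computes one simple notion of relational autocorrelation: the fraction of links whose two endpoints share a class label. This is an assumption made for illustration, not necessarily the measure used in the dissertation; the function name `label_autocorrelation` and the toy graph are hypothetical.

```python
from typing import Hashable, Iterable, Mapping, Tuple


def label_autocorrelation(
    edges: Iterable[Tuple[Hashable, Hashable]],
    labels: Mapping[Hashable, Hashable],
) -> float:
    """Fraction of links whose two endpoints carry the same class label.

    Hypothetical helper: a simple label-agreement measure, assumed here
    as a stand-in for relational autocorrelation.
    """
    same = 0
    total = 0
    for u, v in edges:
        # Skip links that touch an unlabeled node.
        if u not in labels or v not in labels:
            continue
        total += 1
        if labels[u] == labels[v]:
            same += 1
    if total == 0:
        raise ValueError("no link has both endpoints labeled")
    return same / total


# Toy graph invented for illustration: two groups joined by one cross link.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("c", "d")]
labels = {"a": "spam", "b": "spam", "c": "spam", "d": "ham", "e": "ham"}

# Four of the five links connect nodes with the same label.
score = label_autocorrelation(edges, labels)
print(score)
```

A value close to 1 means linked samples mostly share a label, which is the situation in which the abstract reports RC models outperforming classifiers that treat samples as independent.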
Keywords/Search Tags: Relational Classification, Learning Bound, Multi-task Learning, Transfer Learning, Network Availability Estimation, Network Opinion Leader Recognition, Network Public Opinion Tendency Analysis