Research On Networked Data Classification Based On Active Learning

Posted on:2016-11-26

Degree:Master

Type:Thesis

Country:China

Candidate:H H Xu

Full Text:PDF

GTID:2308330464453269

Subject:Computer Science and Technology

Abstract/Summary:

The rapid development of information technology has produced a variety of complex networks, such as social networks and biological networks, and the wide application of these networks has generated large amounts of networked data. Classification of networked data has recently become an important and widely studied problem in machine learning and data mining. Unlike traditional data classification, it must consider not only how to use the characteristics of networked data but also how to exploit the links between the data. Classification of networked data is therefore an important and pressing research problem.

This thesis studies active learning techniques and networked data classification in depth. It proposes a framework of active learning for networked data classification, together with several specific active learning approaches for classifying networked data. The main work is as follows:

(1) On the problem of sampling strategy in active learning, the thesis analyzes the existing measures of uncertainty, representativeness, and diversity, and explores the effect of each specific criterion on networked data classification. This provides a theoretical basis for the subsequent work.

(2) Different sampling strategies measure the value of an instance from different points of view, and they do not contribute equally to instance selection. To exploit the links between networked data effectively, the thesis proposes an adaptive active learning method for networked data classification that integrates several sampling criteria. The method dynamically adjusts the weights of the criteria to estimate the information content of each instance, and then selects the high-valued instances for labeling.

(3) Batch mode active learning selects multiple instances for labeling at a time. To take the characteristics of networked data fully into account, an instance correlation matrix is built by measuring instance uncertainty, representativeness, and diversity. Based on this correlation matrix, the thesis proposes a batch mode active learning method for networked data classification based on an optimal instance subset. The method selects an optimal subset of instances to be annotated by an oracle, so that the classifier achieves better classification performance.

Experiments on several networked datasets are used to verify the proposed methods, and the analysis of the experimental results shows the effectiveness of the proposed approaches.
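The abstract does not give the formulas behind the three criteria or the subset selection, so the following Python sketch is only an illustration of the general idea, not the method of the thesis. It assumes entropy as the uncertainty measure, node degree as a stand-in for representativeness, the fraction of unlabeled neighbours as a stand-in for diversity, fixed weights instead of dynamically adjusted ones, and a simple greedy rule in place of the correlation-matrix optimisation. All function names, parameters, and the toy graph are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def combined_scores(probs, adjacency, labeled, weights=(1.0, 1.0, 1.0)):
    """Score each unlabeled node by a weighted sum of three criteria.

    probs:     dict node -> class probabilities (at least two classes)
    adjacency: dict node -> set of neighbouring nodes (the network links)
    labeled:   set of nodes that already have labels
    weights:   (uncertainty, representativeness, diversity); fixed here,
               whereas the thesis describes adjusting them dynamically
    """
    w_u, w_r, w_d = weights
    max_deg = max(len(nbrs) for nbrs in adjacency.values()) or 1
    scores = {}
    for node, p in probs.items():
        if node in labeled:
            continue
        nbrs = adjacency[node]
        uncertainty = entropy(p) / math.log(len(p))  # normalised to [0, 1]
        representativeness = len(nbrs) / max_deg     # assumption: degree
        # Assumption: a node is more "diverse" if few neighbours are labeled.
        diversity = len(nbrs - labeled) / len(nbrs) if nbrs else 1.0
        scores[node] = (w_u * uncertainty
                        + w_r * representativeness
                        + w_d * diversity)
    return scores


def select_batch(scores, adjacency, k):
    """Greedily pick k high-scoring nodes, avoiding linked (redundant) ones."""
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    batch, blocked = [], set()
    for node in ranked:
        if len(batch) == k:
            break
        if node not in blocked:
            batch.append(node)
            blocked |= adjacency[node] | {node}
    # Fallback for dense graphs: top up with the best remaining nodes.
    for node in ranked:
        if len(batch) == k:
            break
        if node not in batch:
            batch.append(node)
    return batch


# Toy network: a triangle a-b-c and a separate edge d-e.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"e"},
    "e": {"d"},
}
probs = {
    "a": [0.50, 0.50],
    "b": [0.55, 0.45],
    "c": [0.90, 0.10],
    "d": [0.60, 0.40],
    "e": [0.95, 0.05],
}

scores = combined_scores(probs, adjacency, labeled=set())
batch = select_batch(scores, adjacency, k=2)
print(batch)  # ['a', 'd']
```

In this toy example the two highest-scoring nodes are `a` and `b`, but they are linked, so labeling both would be largely redundant. The greedy rule skips `b` and picks `d` from the other component, which is the kind of trade-off a batch mode method has to make.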

Keywords/Search Tags:

Active Learning, Networked Data, Batch Mode, Correlation Link, Sampling Strategy

Related items

1	Research On Batch Active Learning Algorithm Based On Generative Adversarial Networks
2	Research Of Sampling Strategy In Active Learning Algorithms
3	Batch Mode Active Learning For Exploring Structure Of Networks
4	Research On Classifiers For Data Streams Based On Active Learning
5	Research And Application Of Active Learning In Batch Mode
6	Analysis And Optimization Of Networked Control Systems Based On Hyper-Sampling Mode
7	Active Learning For Image Classification
8	Fault-tolerant Control For Networked Batch Processes Based On Iterative Learning Method
9	Research On Image Annotation Based Active Learning
10	Research On Active Learning Algorithms In Continual Learning Framework