The Research Of Semi-supervised Learning And Application In Police Platform

Posted on: 2011-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: M M Huang
Full Text: PDF
GTID: 2178330332460846
Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the information age, the capacity to collect and store data has improved greatly, and discovering and extracting useful information from massive data sets has become a common requirement across the information field. Traditional supervised and unsupervised learning do not solve this problem efficiently. Supervised learning trains on only a small number of labeled samples and wastes the information hidden in the large pool of unlabeled data, while unsupervised learning only clusters unlabeled samples and wastes the labeled data. Semi-supervised learning addresses this by using a large amount of unlabeled data to assist learning from insufficient labeled data, and it has attracted the attention of scholars in machine learning and data mining. In practice, noise may be introduced during data collection. This thesis studies robust semi-supervised learning, that is, how to improve learning performance when noise exists. First, it analyzes a typical Gaussian-Laplacian regularization algorithm, which is based on the least-squares criterion and is therefore sensitive to noise. Combining this algorithm with the correntropy criterion, the thesis proposes a robust semi-supervised learning algorithm based on the maximum correntropy criterion, together with its convergence analysis. In this algorithm, the Welsch M-estimator replaces the least-squares term of the original objective function. Because the proposed objective function is nonlinear, it is difficult to optimize directly.
A half-quadratic optimization algorithm based on a local greedy algorithm is then proposed, which reduces the correntropy optimization problem to a standard semi-supervised problem in each iteration. Promising experimental results on UCI databases and face databases demonstrate the effectiveness of the method under mislabeling noise and image noise. The proposed algorithm is applied to the Dalian public security platform, in a project on full-text retrieval over a police database. The semi-supervised learning algorithm is used in the pre-processing module, which divides Chinese texts into different themes. Retrieval is then performed within a single theme, which contains only a subset of the texts. This reduces the number of texts to be searched, thereby shortening retrieval time and improving system availability.
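The abstract gives no formulas, so the following is only an illustrative sketch of how a half-quadratic scheme for a Welsch (correntropy) loss is commonly set up. It is not the thesis's actual algorithm. It assumes a graph-Laplacian regularized objective, a fully connected RBF graph, and binary labels in {-1, +1}. The function names (`rbf_graph_laplacian`, `correntropy_ssl`) and the parameters (`gamma`, `sigma`, `bandwidth`, `n_iter`) are hypothetical. Each iteration solves an ordinary weighted least-squares semi-supervised problem, then recomputes the per-sample weights from the residuals, so samples whose labels disagree with the graph structure are progressively down-weighted.

```python
import numpy as np


def rbf_graph_laplacian(X, bandwidth=1.0):
    """Unnormalized Laplacian of a fully connected RBF similarity graph (assumed graph)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return np.diag(W.sum(axis=1)) - W


def correntropy_ssl(X, y, labeled, gamma=1.0, sigma=0.5, bandwidth=1.0, n_iter=30):
    """Half-quadratic iteration for a Welsch-loss, Laplacian-regularized objective.

    X: (n, d) features; y: (n,) labels in {-1, +1}, 0 where unlabeled;
    labeled: (n,) boolean mask. Returns soft predictions f and final weights.
    """
    n = X.shape[0]
    L = rbf_graph_laplacian(X, bandwidth)
    mask = labeled.astype(float)
    weights = mask.copy()          # first pass is plain least squares
    ridge = 1e-8 * np.eye(n)       # small ridge keeps the system well conditioned
    f = np.zeros(n)
    for _ in range(n_iter):
        # Step 1: with weights fixed, solve the weighted least-squares problem
        #   min_f  sum_i w_i (f_i - y_i)^2 + gamma * f^T L f
        A = np.diag(weights) + gamma * L + ridge
        f = np.linalg.solve(A, weights * y)
        # Step 2: with f fixed, update the auxiliary (Welsch) weights
        residual = f - y
        weights = mask * np.exp(-residual ** 2 / (2.0 * sigma ** 2))
    return f, weights


# Toy data: two well-separated 1-D clusters; index 3 carries a wrong label.
X = np.concatenate([np.linspace(0.0, 1.0, 6), np.linspace(6.0, 7.0, 6)])[:, None]
y = np.zeros(12)
labeled = np.zeros(12, dtype=bool)
y[[0, 1, 2]] = -1.0
y[3] = 1.0                         # mislabeled sample in the first cluster
y[[6, 7, 8]] = 1.0
labeled[[0, 1, 2, 3, 6, 7, 8]] = True

f, weights = correntropy_ssl(X, y, labeled)
predictions = np.sign(f)
```

In this toy run the final `weights` vector shows which labeled samples the procedure has effectively discounted. The choice of `sigma` controls how aggressively large residuals are suppressed and would need tuning on real data.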
Keywords/Search Tags:semi-supervised learning, robust, correntropy, full-text retrieval