Font Size: a A A

Classification Of Uncertain Data Based On Nearest Neighbor

Posted on:2016-06-04Degree:MasterType:Thesis
Country:ChinaCandidate:H Y LangFull Text:PDF
GTID:2308330470468723Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
In spite of the great progress in the data miming field in recent years,the problem of handing uncertain data has remained a great challenge for data mining algorithms.The purpose of classification is to analysis the input data, through the training data shows features, for each class to find an accurate description or model.This description is often represented by predicating.The resulting class description is used to classify the test data.Although the label of these test data is unknown, we can still predict these new data belonging to a class.Attention,it is the forecast, and unknown for sure.Traditional classification techniques deal with feature vectors having deterministic values,data uncertainty is usually ignored in the learning problem formulation. However, it must be noted that uncertainty arises in real data in many ways, since the data may contain errors or maybe only be partially stored.How to efficient process with uncertainty of error data is still a challenge in the field of data mining.Traditional machine learning algorithms often assume that the values of data are exacter precise,In many emerging applications,however,the data is inherently uncertain.Sampling errors, instrument errors and privacy preserving are both sources of uncertainty, and data are typically represented by probability distributions rather than deterministic values.Recently some traditional classification algorithms have been extended to handle uncertain data classification problems, such as decision trees, support vector machines,etc.However, this work deals with the problem of classifying uncertain data. In view of the traditional classification method in dealing with uncertainty data may return the worry class if the probability for the object to belong to that class approaches zero.With this aim we introduce the Uncertain Nearest Neighbor rule, which represents the generalization of the deterministic nearest neighbor rule to the case in which uncertain objects are available. The rule relies on the concept of nearest neighbor class, rather than on that of nearest neighbor object. The nearest neighbor class of the test object is the class that maximizes the probability of providing its nearest neighbor. The experiment proved that the former concept is much more powerful than the latter in the presence of uncertainty. An effective and efficient algorithm to perform uncertain nearest neighbor classification of the test object is designed. Experimental results are presented, showing that the rule is effective and efficient in classifying uncertain data.
Keywords/Search Tags:nearest neighbor rule, nearest neighbor class, uncertain data
PDF Full Text Request
Related items