
Improved Classification Algorithms Based On KNN And All-confidence Pattern

Posted on: 2011-07-31
Degree: Master
Type: Thesis
Country: China
Candidate: L Z Zhang
Full Text: PDF
GTID: 2178360308474015
Subject: Computer application technology
Abstract/Summary:
Data mining is a complex process that extracts implicit, previously unknown, valuable, and useful knowledge from databases. Classification is one of its most important tasks and is widely applied in areas such as medical diagnosis, speech recognition, face recognition, and radar detection. Classification algorithms designed for different problems each have their own advantages and limitations, and they achieve different accuracy on different data. This paper proposes improvements to KNN and to classification based on association rules.

Traditional KNN algorithms consider only the number of nearest neighbors in each class, or only their average similarity. When a class has many nearest neighbors but the similarity between those samples and the unknown sample is relatively low, the unknown sample may be misclassified. When two classes have the same number of nearest neighbors or the same average similarity, no decision can be made. To address this, the paper proposes an improved KNN algorithm based on the contribution of attribute values, the average similarity, and the number of similar samples. Experiments on mushroom data sets show that this approach achieves higher accuracy than the traditional K-nearest neighbor algorithm.

Not all patterns found by frequent pattern mining are interesting to users. When the pre-specified minimum support threshold is too low, too many patterns are generated and too many rules are extracted from them. In addition, the correlation between the attribute values within a frequent pattern may be weak. The paper presents a classification algorithm based on all-confidence patterns. It considers not only the independent occurrence of a single attribute value but also the co-occurrence of multiple attribute values that are highly associated with one another. Experiments, again on mushroom data sets, show that this approach achieves higher accuracy than the rough set algorithm and the algorithm based on frequent patterns.
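
The abstract does not give the thesis's formulas, so the following is only a minimal sketch of the two ideas under stated assumptions. The all-confidence function uses the standard definition of the measure, sup(X) / max over items i in X of sup({i}). The KNN vote scores each class by the sum of its neighbors' similarities (that is, neighbor count times average similarity), which is one simple way to combine the two quantities and is not claimed to be the thesis's method; the attribute-value contribution weighting is omitted. The function names, the overlap similarity, and the toy mushroom-like records are all hypothetical.

```python
from collections import defaultdict


def all_confidence(pattern, transactions):
    """Standard all-confidence: sup(X) / max over items i in X of sup({i})."""
    pattern = frozenset(pattern)
    sup_pattern = sum(1 for t in transactions if pattern <= t)
    max_item_sup = max(
        sum(1 for t in transactions if item in t) for item in pattern
    )
    return sup_pattern / max_item_sup if max_item_sup else 0.0


def similarity(a, b):
    """Fraction of matching categorical attribute values (an assumed measure)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def knn_classify(query, samples, labels, k=3):
    """Pick the class with the largest summed similarity among the k nearest.

    Summed similarity equals count * average similarity, so a class with many
    weakly similar neighbors does not automatically win. Ties on the sum are
    broken by average similarity. This scoring rule is an assumption.
    """
    scored = sorted(
        ((similarity(query, s), lab) for s, lab in zip(samples, labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    sims = defaultdict(list)
    for sim, lab in scored:
        sims[lab].append(sim)
    return max(
        sims, key=lambda lab: (sum(sims[lab]), sum(sims[lab]) / len(sims[lab]))
    )


# Hypothetical toy data, not taken from the thesis or the real mushroom data.
transactions = [
    frozenset({"odor=none", "gill=broad", "class=edible"}),
    frozenset({"odor=none", "gill=broad", "class=edible"}),
    frozenset({"odor=none", "gill=narrow", "class=edible"}),
    frozenset({"odor=foul", "gill=narrow", "class=poisonous"}),
]
samples = [
    ("convex", "brown", "smooth"),
    ("flat", "white", "smooth"),
    ("bell", "red", "smooth"),
]
labels = ["edible", "poisonous", "poisonous"]
query = ("convex", "brown", "smooth")

if __name__ == "__main__":
    print(all_confidence({"odor=none", "class=edible"}, transactions))
    print(all_confidence({"odor=none", "gill=broad"}, transactions))
    print(knn_classify(query, samples, labels, k=3))
```

In the toy KNN example, a plain majority vote over the three neighbors would choose "poisonous" (two neighbors against one), while the summed-similarity vote chooses "edible" because the single edible neighbor matches the query exactly. This is the kind of misjudgement the abstract attributes to counting neighbors alone.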
Keywords/Search Tags: KNN algorithm, similarity, confidence, all-confidence pattern