The Research Of The Gradual Chinese Text Classifying Technology

Posted on: 2005-10-15
Degree: Master
Type: Thesis
Country: China
Candidate: H X Zhang
GTID: 2168360125954755
Subject: Computer applications
Abstract/Summary:
K-NN (K-nearest neighbor) is a statistics-based classification method that is widely used in data mining. Its basic idea is as follows: given a document to be classified, the system finds its K nearest neighbors in the training set and assigns the document to the class to which most of those K neighbors belong.

K-NN is a lazy learning method, because it does not build an explicit classifier. It simply stores all the training samples and retrieves them for computation at classification time. As the number of training samples grows, classification therefore takes more and more time, so K-NN is slower than eager learning. In terms of learning, however, it has an advantage over eager learning.

This thesis exploits K-NN's advantage in accuracy while addressing its weakness in speed, and puts forward the idea of gradual classification. When classifying a text, it uses the title, the keywords, the important paragraphs, and finally the whole text, step by step. If the text can be classified successfully from the earlier information, the later steps are unnecessary and classification becomes faster. Experimental data indicate that this method achieves higher speed and accuracy in classification.
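The gradual idea described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not say how text is represented, how similarity is measured, or what counts as classifying "successfully", so the bag-of-words vectors, the cosine similarity, the vote-share `threshold`, and all function and field names (`knn_vote`, `classify_gradually`, `key_paragraphs`, and so on) are assumptions made for illustration. Whitespace tokenization is also a simplification; Chinese text would first need word segmentation.

```python
import math
from collections import Counter

# Stage order taken from the abstract; the field names are hypothetical.
STAGES = ("title", "keywords", "key_paragraphs", "full_text")


def vectorize(text):
    """Bag-of-words term counts (assumption: whitespace-separated tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_vote(query, training, k):
    """Return (majority label, vote share) over the k most similar samples.

    Samples with zero similarity are ignored; the share is counted out of k,
    so a query with few real neighbors gets a low share.
    """
    q = vectorize(query)
    scored = [(cosine(q, vectorize(text)), label) for text, label in training]
    neighbors = sorted((s for s in scored if s[0] > 0),
                       key=lambda s: s[0], reverse=True)[:k]
    if not neighbors:
        return None, 0.0
    votes = Counter(label for _, label in neighbors)
    label, count = votes.most_common(1)[0]
    return label, count / k


def classify_gradually(doc, training, k=3, threshold=0.6):
    """Try title, keywords, key paragraphs, then full text; stop early
    once the vote share reaches the (assumed) confidence threshold."""
    label = None
    for stage in STAGES:
        label, share = knn_vote(doc.get(stage, ""), training, k)
        if label is not None and share >= threshold:
            return label, stage
    return label, STAGES[-1]


# Toy training set of (text, label) pairs, invented for this example.
training = [
    ("football match goal team league", "sports"),
    ("basketball team player score game", "sports"),
    ("tennis player match tournament", "sports"),
    ("stock market shares price investors", "finance"),
    ("bank interest rate loan market", "finance"),
    ("investors fund bank profit shares", "finance"),
]

clear_title = {"title": "team wins league match"}
vague_title = {"title": "annual review", "keywords": "shares investors"}

print(classify_gradually(clear_title, training))
print(classify_gradually(vague_title, training))
```

In this toy run the first document is decided from its title alone, while the second has an uninformative title and falls through to its keywords. The saving comes from the early stages comparing much shorter texts, and from skipping the full-text stage whenever an earlier one is decisive.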
Keywords/Search Tags: K-nearest neighbor, Gradual thinking, Text mining