Research On Feedback Learning In Chinese Text Categorization

Posted on: 2010-09-25; Degree: Master; Type: Thesis
Country: China; Candidate: Z G Zhang; Full Text: PDF
GTID: 2178360272978279; Subject: Information Science
Abstract/Summary:
With the continuing expansion of the Internet, network information resources are growing at an exponential rate, and people must face the question of how to discover and mine the information they need from this vast pool of resources. Automatic text classification methods have been explored to improve classification efficiency and accuracy. However, a limited set of training texts cannot cover all types of text, and over time the categories acquire many new text features, which makes the original classifier outdated. If the original classifier is still used to classify current texts, it may cause a series of problems, including misclassifications and omissions. Therefore, dynamically improving classification performance on the basis of user feedback has become an urgent issue.
On the basis of extensive research on text classification, the key technologies of text classification are described in detail, including word segmentation, text representation, feature extraction and selection, feature weighting, classification algorithms (especially SVM and KNN classification), and classification performance evaluation.
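As one illustration of the feature selection step named above, the following is a minimal sketch of scoring terms with the chi-square statistic (CHI, one of the methods the thesis compares). The function names `chi_square` and `select_features`, the representation of each document as a set of tokens, and the toy data are assumptions made for this example; they are not taken from the thesis.

```python
def chi_square(n11, n10, n01, n00):
    """CHI statistic for one term/category pair from a 2x2 contingency table.

    n11: documents in the category that contain the term
    n10: documents outside the category that contain the term
    n01: documents in the category that lack the term
    n00: documents outside the category that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0  # term or category is degenerate; no evidence either way
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(docs, labels, category, top_k):
    """Rank terms by CHI against one category and keep the top_k.

    docs is a list of token sets (hypothetical representation).
    Ties are broken alphabetically so the result is deterministic.
    """
    vocab = set().union(*docs)
    scores = {}
    for term in vocab:
        n11 = n10 = n01 = n00 = 0
        for doc, label in zip(docs, labels):
            has_term = term in doc
            in_cat = label == category
            if has_term and in_cat:
                n11 += 1
            elif has_term:
                n10 += 1
            elif in_cat:
                n01 += 1
            else:
                n00 += 1
        scores[term] = chi_square(n11, n10, n01, n00)
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Toy data, invented for illustration only.
docs = [{"ball", "team"}, {"ball", "goal"}, {"stock", "bank"}, {"stock", "market"}]
labels = ["sports", "sports", "finance", "finance"]
top_terms = select_features(docs, labels, "sports", 2)
```

Note that CHI measures the strength of association, not its direction, so a term that appears only outside the category scores as highly as one that appears only inside it.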
Experiments on text sets of different sizes examine how classification performance is affected by the feature selection methods IG, MI, CE, CHI and WE, by the choice of kernel function in the SVM algorithm, by the dimension of the feature vector, by text feature extraction and selection, and by the choice of the k value in the KNN algorithm; the results are then analyzed further.
Building on this study of Chinese text categorization, relevance feedback is introduced into Chinese text categorization. The basic idea of feedback learning for text classification is analyzed in detail, and the feedback learning classification workflow and the feedback learning algorithm are discussed in depth. A Chinese text classification model based on feedback is constructed, and the feedback framework and functional modules of the Chinese text categorization system are specified. Finally, empirical studies on training-set and non-training-set texts, respectively, show that feedback learning significantly improves classification performance and that the quality of the training samples used for learning has an important effect on classification performance.
By adding feedback learning to the Training - Classification model, a Training - Classification - Feedback model for Chinese text classification is formed. The model remedies the inadequate training of the original model and allows it to be updated faster. The classifier changes markedly from inadequately trained to fully trained, after which classification performance temporarily stabilizes. Research on Chinese text categorization based on feedback learning therefore has strong theoretical and practical significance.
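The Training - Classification - Feedback cycle can be sketched roughly as follows. The abstract does not specify the thesis's feedback learning algorithm, so this example makes its own assumptions: a KNN classifier over term-frequency vectors with cosine similarity, and feedback implemented simply by adding each user-corrected document to the stored training samples. The class name `FeedbackKNN`, its methods, and the toy data are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class FeedbackKNN:
    """Hypothetical KNN text classifier that learns from user corrections."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []  # list of (term-frequency Counter, label)

    def train(self, tokenized_docs, labels):
        # Training step: store every labelled document as a sample.
        for tokens, label in zip(tokenized_docs, labels):
            self.samples.append((Counter(tokens), label))

    def classify(self, tokens):
        # Classification step: majority vote among the k most similar samples.
        vec = Counter(tokens)
        ranked = sorted(self.samples, key=lambda s: cosine(vec, s[0]), reverse=True)
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]

    def feedback(self, tokens, correct_label):
        # Feedback step: a user-corrected document becomes a new sample,
        # so later classifications can use features the original set lacked.
        self.samples.append((Counter(tokens), correct_label))


# Toy data, invented for illustration only.
clf = FeedbackKNN(k=1)
clf.train([["ball", "team", "goal"], ["stock", "bank", "market"]], ["sports", "finance"])
before = clf.classify(["phone", "chip", "screen"])   # no stored sample shares a term
clf.feedback(["phone", "chip"], "tech")              # user supplies the right category
after = clf.classify(["phone", "chip", "screen"])
```

In this sketch the classifier cannot assign the new category until a corrected example arrives through `feedback`, which mirrors the move from an inadequately trained classifier toward a more fully trained one.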
Keywords/Search Tags: Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Text Categorization, Feedback Learning