Chinese Text Classification Based on Association Technology

Posted on: 2012-11-15, Degree: Master, Type: Thesis
Country: China, Candidate: D S Zhao, Full Text: PDF
GTID: 2208330335986412, Subject: Computer Application Technology
Abstract/Summary:
As the volume of information grows explosively, useful knowledge becomes harder to obtain. Effective organization and management help people find exactly the knowledge they need, so text classification has become a major research focus. Because Chinese is complex to describe and domestic research in this area started late, Chinese text classification still faces a variety of unresolved difficulties. Researchers abroad have proposed many well-performing text classification methods and techniques, which are continually being adopted in the study of Chinese text classification. Much research on classification accuracy and reliability has been carried out and has produced some practical results. Associative classification, a broadly applicable and effective approach, was proposed as research on association rules and classification algorithms progressed. Taking Chinese text collections as its research object, this thesis studies Chinese text classification techniques based on association technology. Weighing the advantages and disadvantages of traditional methods, it designs two Chinese text classification methods. The first method builds on CMAR, which classifies with multiple association rules. It takes full account of the properties of frequent closed itemsets and draws on TFP, which avoids the difficulty of presetting min_sup, and it makes corresponding improvements in order to mine the best closed rules for classification.
The second method optimizes the CPAR algorithm, building on its efficiency in generating potential rules. It adopts a new rule measure, FGIG, to select conjuncts, introduces multiple attenuation factors to find high-quality potential rules, uses Laplace accuracy to evaluate rules effectively, and finally integrates homologous rules to improve the classification strategy. Both methods were tested on Chinese text classification. Across multiple sets of comparative experiments, the analysis of the results shows that both methods achieve good classification accuracy. The methods studied in this thesis therefore have practical value and offer guidance for Chinese text classification.
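The rule-evaluation step of the second method can be illustrated with a short sketch. It assumes the standard Laplace accuracy commonly used with CPAR-style rule learners, (n_c + 1) / (n_tot + k), where n_tot is the number of training examples a rule covers, n_c is how many of those belong to the rule's predicted class, and k is the number of classes. The function name, the rule tuples, and all counts are hypothetical and are not taken from the thesis.

```python
def laplace_accuracy(n_correct: int, n_covered: int, n_classes: int) -> float:
    """Laplace-corrected accuracy estimate of a classification rule.

    n_correct: covered training examples that have the rule's predicted class
    n_covered: all training examples matched by the rule body
    n_classes: number of classes in the task
    """
    if n_classes <= 0:
        raise ValueError("n_classes must be positive")
    if not 0 <= n_correct <= n_covered:
        raise ValueError("need 0 <= n_correct <= n_covered")
    return (n_correct + 1) / (n_covered + n_classes)


# Hypothetical rules: (name, n_correct, n_covered), for a 3-class task.
rules = [("rule_a", 9, 10), ("rule_b", 2, 2)]
N_CLASSES = 3

# Rank rules from highest to lowest Laplace accuracy.
ranked = sorted(
    rules,
    key=lambda r: laplace_accuracy(r[1], r[2], N_CLASSES),
    reverse=True,
)

for name, n_correct, n_covered in ranked:
    score = laplace_accuracy(n_correct, n_covered, N_CLASSES)
    print(f"{name}: {score:.3f}")
```

In this made-up example, rule_b is right on every example it covers but covers only two, so its estimate (3/5) falls below that of rule_a (10/13). This is the usual motivation for the correction: it discounts rules whose apparent accuracy rests on very little evidence.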
Keywords/Search Tags:Associative Classification, Chinese text classification, Association Rules, Frequent Closed Itemsets, Potential Rules, Evaluation Criteria