Research on Association Rule Mining Technology for Text-Oriented Disciplinary Correlation Analysis

Posted on: 2012-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: G Ren
Full Text: PDF
GTID: 2208330335990662
Subject: Computer Science and Technology
Abstract/Summary:
Text mining is an active research area in modern information processing, and text data preprocessing and association rule extraction both play important roles in it. Addressing open problems in text mining preprocessing and the general characteristics of project applications in an evaluation system, this thesis proposes a new method of association rule analysis based on an improved maximum weighting of keyword features. On that basis, it also considers the post-processing of the resulting association rules.
Based on the characteristics of science and technology project applications, a method for extracting association rules from text data is studied and designed, using an effective description of the text feature vectors. An improved method for keyword feature weighting is introduced by analyzing traditional text feature selection in the TF-IDF algorithm. In this method, weights are computed from the information domain of field keywords, so that text features can be selected effectively for the relevant disciplines in science and technology project applications. The validity of the method is verified by experiments.
To address the very high dimensionality of Chinese text vector indexes and the complexity of generating frequent itemsets in traditional association rule extraction, a scheme based on the XML format and maximum feature weighting is proposed. This scheme simplifies both the storage of text data and the computation over it throughout the text mining process.
To suit disciplinary correlation analysis, a post-processing technique based on keyword co-occurrence within a subject field is applied to the extracted association rules. By calculating the co-occurrence between a keyword and the typical words of a subject field, new research topics and unexplored gaps in interdisciplinary fields can be discovered.
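As a rough illustration of the pipeline the abstract describes, the Python sketch below weights terms with plain TF-IDF, boosts terms found in a field keyword list, keeps the highest-weighted keywords of each document as its item set, and then mines pairwise association rules by support and confidence. The thesis's actual weighting formula, XML storage format, and thresholds are not given here, so the `boost` factor, the smoothed IDF, and all function names (`weighted_tfidf`, `top_keywords`, `pair_rules`) are assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def weighted_tfidf(docs, domain_keywords, boost=2.0):
    """Return one {term: weight} dict per tokenized document.

    Plain TF-IDF, with terms in `domain_keywords` multiplied by `boost`.
    The boost is an illustrative stand-in, not the thesis's formula.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        doc_weights = {}
        for term, count in counts.items():
            tf = count / len(doc)
            # "+ 1.0" keeps terms shared by every document at a nonzero weight.
            idf = math.log(n_docs / doc_freq[term]) + 1.0
            factor = boost if term in domain_keywords else 1.0
            doc_weights[term] = tf * idf * factor
        weights.append(doc_weights)
    return weights


def top_keywords(doc_weights, k):
    """Keep the k highest-weighted terms of one document as its item set."""
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term for term, _ in ranked[:k]}


def pair_rules(transactions, min_support, min_confidence):
    """Mine rules lhs -> rhs between single keywords.

    Returns (lhs, rhs, support, confidence) tuples that meet both thresholds.
    """
    n = len(transactions)
    item_count = Counter(item for t in transactions for item in t)
    pair_count = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules
```

Each document is assumed to be a list of already segmented tokens. Calling `weighted_tfidf(docs, domain_keywords)` and then `top_keywords` on each result yields the keyword sets that `pair_rules` takes as transactions. Restricting every document to its top-weighted keywords is one simple way to keep the item space small before frequent sets are counted.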
Keywords/Search Tags: Text Mining, keyword-weight, association rules, interdisciplinary, co-occurrence