Study On Methods Of Data Mining And Text Mining Based On Fuzzy Logic And Neural Network

Posted on: 2006-01-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Q Geng
Full Text: PDF
GTID: 1119360212489268
Subject: Management Science and Engineering
Abstract/Summary:
Data mining and text mining have recently become important research areas in information technology, and applying fuzzy logic theory to them has both theoretical significance and practical value. This dissertation studies several methods of data mining and text mining. The main contributions are as follows.

In the algorithm, the self-organizing feature map (SOFM) network is used to determine the membership degrees of the sample data.

The main defect of traditional fuzzy clustering methods is that the number of clusters must be known in advance. To address this, a novel dynamic fuzzy clustering method is presented. First, text feature vectors are obtained using the vector space model (VSM) and the TF-IDF method. Then the number of clusters is obtained by a dynamic self-organizing map and passed to the fuzzy C-means (FCM) algorithm. Finally, the clustering result is obtained by FCM. The proposed algorithm achieves much higher precision than traditional FCM.

A new model, the dynamic fuzzy Kohonen neural network (DFKCN), is proposed and applied to text clustering. DFKCN adopts the structure of the dynamic self-organizing map (TGSOM), which can determine the number of clusters automatically. A new formula for the learning rate is proposed, and DFKCN uses the fuzzy cluster center vectors as the corresponding neuron weights. Both the clustering precision and the convergence rate of the network are improved. The DFKCN model is used for Chinese text clustering. Text feature vectors are represented using latent semantic analysis (LSA), which captures the semantic relations among the feature words and reduces the dimensionality of the feature vectors.

A new fuzzy competitive neural network clustering (NFCNNC) model is proposed and applied to text clustering. NFCNNC uses the fuzzy central vectors obtained by the fuzzy central clustering (FCC) algorithm as the weights of the neural network.
The winner unit in the model is found by comparing the membership degree values of the neurons. Following the formulas of the FCC algorithm, both the fuzzy cluster center vectors (the weights of the neural network) and the membership degrees are adjusted, and the number of clusters is determined once the network becomes stable. The NFCNNC model has a simpler structure, higher precision, and higher efficiency, and it overcomes the defect that traditional algorithms need to know the number of clusters in advance.

A new fuzzy association rule mining algorithm for text mining and a keyword acquisition method are proposed. When the number of texts is large, the traditional formula for the support degree of a fuzzy association rule is not suitable, so a new formula for the support degree is presented. The weights of the feature terms are calculated by the TF-IDF method. The average weight in a text is used as the threshold, and each feature term whose weight exceeds the threshold is taken as a keyword of the text. The keyword weight is classified into three attributes: high, middle, and low. The fuzzy C-means algorithm is applied to cluster the keyword weights. The NFAR algorithm is then presented to mine the association rules of the texts; it achieves higher efficiency and precision.
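
The keyword acquisition step described above (TF-IDF weights, an average-weight threshold, then fuzzy C-means over the keyword weights with three levels) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the TF-IDF variant (length-normalized term frequency times ln(N/df)), the fuzzifier m = 2, the evenly spaced initial centers, and all function names are assumptions made for illustration. The new support-degree formula and the NFAR algorithm are not given in the abstract, so they are not reproduced here.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """docs: list of non-empty token lists. Returns one {term: weight} dict per text.
    Assumed variant: (count / text length) * ln(N / document frequency)."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    return [
        {t: (c / len(doc)) * math.log(n / df[t]) for t, c in Counter(doc).items()}
        for doc in docs
    ]

def select_keywords(weights):
    """Keep the terms whose weight exceeds the text's average weight."""
    threshold = sum(weights.values()) / len(weights)
    return {t: w for t, w in weights.items() if w > threshold}

def memberships(values, centers, m):
    """Standard FCM membership update for scalar data."""
    eps = 1e-12  # avoids division by zero when a value sits on a center
    rows = []
    for x in values:
        d = [abs(x - v) + eps for v in centers]
        rows.append([1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in d) for di in d])
    return rows

def fuzzy_levels(values, m=2.0, iters=100):
    """Cluster scalar weights into three fuzzy sets with standard FCM.
    Returns one {"low": u, "middle": u, "high": u} dict per input value."""
    lo, hi = min(values), max(values)
    centers = [lo, (lo + hi) / 2.0, hi]  # assumed initialization
    for _ in range(iters):
        u = memberships(values, centers, m)
        # Center update: mean of the values weighted by u^m
        centers = [
            sum(row[i] ** m * x for row, x in zip(u, values)) / sum(row[i] ** m for row in u)
            for i in range(3)
        ]
    u = memberships(values, centers, m)
    order = sorted(range(3), key=lambda i: centers[i])  # smallest center = "low"
    names = ("low", "middle", "high")
    return [{names[rank]: row[i] for rank, i in enumerate(order)} for row in u]

# Toy usage with made-up texts and weights
docs = [
    ["fuzzy", "clustering", "text", "text"],
    ["neural", "network", "text"],
    ["fuzzy", "rule", "mining"],
]
weights = tfidf_weights(docs)
keywords = select_keywords(weights[0])
levels = fuzzy_levels([0.10, 0.12, 0.50, 0.52, 0.90, 0.95])
```

Each keyword weight receives a membership degree in all three levels instead of a hard label, which is the form of input a fuzzy association rule miner works with.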
Keywords/Search Tags: membership, fuzzy association rule, fuzzy clustering, data mining, text mining