The Research On Distributed Data Mining In Electronic Commerce Environment

Posted on: 2008-12-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X G Yu
Full Text: PDF
GTID: 1118360215992271
Subject: Computer application technology
Abstract/Summary:
The explosive growth of electronic commerce has made resources and services on the Internet far more plentiful, and new kinds of online business keep emerging. Every day these resources and services generate volumes of heterogeneous, uncertain, and unstructured data whose complexity is far beyond the current human capacity to interpret and digest. It is therefore of practical importance to develop new techniques that make full use of these valuable, complex resources. Data mining is a technique that aims to analyze and understand large volumes of source data and to reveal the knowledge hidden in them. It has been viewed as an important evolution in information processing.

After analyzing and summarizing distributed data mining, Web services, and Agent technology, this dissertation studies the key algorithms and the architecture needed to solve the core problems of distributed data mining in the electronic commerce environment.

Firstly, an adaptive distributed algorithm called P2PAKNNS, for P2P k-nearest neighbor search in high dimensions, is proposed to overcome the shortcomings of KNNs in the electronic commerce environment. Metric spaces, similarity queries, and the principles of GHT* are discussed. The similarity measure function HDSF(X,Y) is given. The insert, range-find, and search algorithms of GHT* are discussed. The P2PAKNNS algorithm is then presented in detail and evaluated experimentally.

Secondly, DENCLUE is analyzed. To overcome its shortcomings and make it applicable to the electronic commerce environment, a new distributed clustering algorithm called KNDC, based on P2PAKNNS and DENCLUE, is proposed. The division of fuzzy clusters, the parameter k, and the parameters σ and ξ are discussed, and the algorithm is presented in detail and evaluated experimentally.

Thirdly, association rule mining algorithms are introduced in detail: the Apriori algorithm, the relative support Apriori algorithm, and the mean itemset divide method. The method of setting the threshold is improved. Building on the BOUIGA and RSAA algorithms, the RSAA-BOUIGA algorithm is proposed to improve the precision and efficiency of mining valuable rare data.

Then, after reviewing earlier research and combining solutions from industry and academia, a new architecture called BWADM is proposed on the basis of the research above. It is a distributed data mining system based on Web services and Agents in the electronic commerce environment. Web service composition rules and the execution of Web service compositions are discussed. Its modules are introduced in detail: the algorithm management module, the control center module, the algorithm database module, and the model representation module.

Lastly, a prototype distributed data mining system for electronic commerce is presented. The logical structure of a data mining system based on Web services is proposed according to Web services technology, and a data mining project based on Web services is designed and implemented. These results demonstrate the reasonableness of BWADM and show that the algorithms above are more effective and precise than current algorithms.
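
The association rule work summarized above starts from the Apriori algorithm. As a point of reference only, the following is a minimal sketch of classic Apriori frequent-itemset mining with a single relative support threshold. It is not the dissertation's relative support Apriori variant or RSAA-BOUIGA, whose details this abstract does not give. The function name `apriori`, the parameter `min_support`, and the sample baskets are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Classic Apriori (illustrative sketch, not the dissertation's variant).

    Returns {itemset: support} for every itemset whose relative support
    (fraction of transactions containing it) is at least min_support.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        # Count step: keep the candidates that reach the threshold.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


if __name__ == "__main__":
    # Hypothetical shopping baskets, invented for this example.
    baskets = [
        {"bread", "milk"},
        {"bread", "diapers", "beer"},
        {"milk", "diapers", "beer"},
        {"bread", "milk", "diapers"},
        {"bread", "milk", "beer"},
    ]
    for itemset, s in sorted(apriori(baskets, 0.5).items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), round(s, 2))
```

A single global threshold like this one tends to discard rare but valuable itemsets, which is the kind of limitation that relative support approaches aim to address.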
Keywords/Search Tags: K-nearest Neighbor, Fuzzy Clusters, Threshold, Rare Data, Architecture