"Digital Party School" Information Distributed Data Mining Study Under Campus Grid Environment

Posted on: 2009-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Wu
Full Text: PDF
GTID: 2178360245957640
Subject: Computer application technology
Abstract/Summary:
With the rapid development of computer technology, there is an urgent need to share network resources effectively, including computing, data, information, and equipment resources. Such sharing should be convenient for users, who should not have to consider the type of computer hardware, the location of the computer, or the operating system installed, and it should let many asynchronous computers work together to improve the throughput of the entire network. Grid technology emerged to meet this need. The "digital party school" grid platform uses grid technology as its entry point to eliminate information islands and fully share information resources. However, as information systems based on databases, data warehouses, and other data storage technologies are applied in all walks of life, massive amounts of data are generated. Managing and organizing the "digital party school" grid data effectively, and extracting the knowledge of interest from it, requires web data mining. This paper integrates data effectively by combining these two tools, the grid and web mining. It focuses on two aspects: first, implementing the "digital party school" grid system to obtain a grid platform suitable for web mining; second, designing a web data mining algorithm suitable for that platform. Accordingly, the paper first introduces the background of the grid, the status quo of web data mining, the basic idea of the grid model, and common methods of web data mining. Resource scheduling is the key factor in whether a grid uses its resources efficiently. The paper builds its web mining model on the grid's resource scheduling model.
According to the characteristics of the "digital party school" grid resources, this paper adopts a distributed data mining method based on web services. Drawing on four classic decision tree algorithms (ID3, C4.5, CHAID, and CART), it proposes a merging and pruning decision tree algorithm suited to the "digital party school" grid platform environment. Using this algorithm to prune and merge the original decision trees not only expands the knowledge covered by the decision tree and improves its prediction accuracy for unknown knowledge, but also yields fewer nodes than the original decision tree, reducing its complexity. Finally, the paper presents a summary and conclusions and outlines directions for further study.
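The abstract does not specify the merging and pruning algorithm itself, so the following is only a minimal sketch of the entropy-based split selection used by ID3, one of the four classical algorithms named above. The function names and the toy records are hypothetical and do not come from the thesis.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the records on one attribute."""
    total = len(rows)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """ID3 chooses the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy records, invented purely for illustration.
rows = [
    {"role": "student", "visits": "high"},
    {"role": "student", "visits": "low"},
    {"role": "teacher", "visits": "high"},
    {"role": "teacher", "visits": "low"},
]
labels = ["yes", "yes", "no", "no"]

chosen = best_attribute(rows, labels, ["role", "visits"])
```

In this toy data the `role` attribute separates the two classes completely while `visits` does not, so `role` is selected as the split. C4.5, CHAID, and CART use different split criteria (gain ratio, chi-square tests, and Gini impurity, respectively), which this sketch does not cover.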
Keywords/Search Tags: "digital party school" grid platform, grid model, web mining, ID3 algorithm, C4.5 algorithm, CHAID algorithm, CART algorithm, merger and pruning decision tree algorithm