Font Size: a A A

The Maxima-finding Algorithm Of Massive Data And Its Relative Application Research

Posted on:2010-08-03Degree:MasterType:Thesis
Country:ChinaCandidate:X Q GuiFull Text:PDF
GTID:2178360275980495Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Data sets in large applications are often too massive to fit completely inside the computer's internal memory. The resulting input/output communication (or I/O) between fast internal memory and slower external memory (such as disks) can be a major performance bottleneck. Especially in those fields as the following: massive geometric data in spatial database, geographical information system (GIS), logical constraints, relational database, science of statistics, virtual reality system, computer graphics. Therefore, the field of design and analysis of external memory (or EM) algorithms and data structures has appeared, where the goal is to exploit locality in order to reduce the I/O costs.Maxima-finding problems are the fundamental problem in computational geometry. It is thought that maxima offers many significant characteristics at the boundary of a point set, and has close connection with many problems such as Convex Hull and Nearest Neighbor. Moreover maxima-finding problems have appeared a great deal of application in many areas recently.The existed algorithms in the field of Maxima-finding problem have been researched in this paper. In this foundation, a new maxima-finding probabilistic algorithm dealing with massive data has been presented. The corresponding reliability has been validated from experiments and theory. The application of the algorithm in the field of database query has also investigated.Primary researches and creations in this paper are as follow.(1) For the maxima-finding problem, there has no linear I/O external memory algorithm yet. So a linear I/O maxima-finding algorithm dealing with massive data has been presented in this paper, which is based on the probability nature of maxima number, and in connection with the independent identically distributed data set. The corresponding reliability has been validated from experiments and theory too.(2) With the rapidly development of information technology and it's deeply applications, the method of data collections is becoming more and more abundant now, and it is more and more common that massive data in databases. In this paper, there is an application of the maxima-finding probabilistic algorithm in the field of Skyline query in database, with a nice experimental performance of this application.
Keywords/Search Tags:external memory algorithm, I/O, maxima, Skyline query
PDF Full Text Request
Related items