Font Size: a A A

Study Of Distributed And Parallel Index

Posted on:2004-06-17Degree:DoctorType:Dissertation
Country:ChinaCandidate:F YangFull Text:PDF
GTID:1118360095960094Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
Along with the continuously extending of computer application field, the number of data become more and more large, and the search operation become more and more complicated. The distributed and parallel index gradually become the valid means of resolving this complicated problem because its high performance, and become the focus of Data Mining, Data Warehouse, Grid Computing and Ubiquitous Computing etc. This paper has analyzed the actual studying state of distributed and parallel index. Then presented a new distributed and parallel index frame and lucubrated the related index structure, index data assignment, index replication strategy, index data migrating and restructuring.In studying of index structure, we presented a new tree index structure that suitable for distributed and parallel and this tree was named DPB+-Tree. DPB+-Tree is based on B+-Tree and hash data structure. In DPB+-Tree, the leaf node is organized for a hash list and the nodes copies gradually decrease from boot to leaf. DPB+-Tree own good qualities of B+-Tree and hash data structure, and consider the performance of copies update, data migrating and load balance. Base on DPB+-Tree, we studied the index data assignment strategy and copies assignment strategy. The index data assignment use value range partitioning strategy and adjust its size by change the range. To obtain best load balance, the copies assignment is dynamic and according to system statistics data to trigger copies increasing or reducing, and perhaps copies migrating.In studying of index replication strategy, we considered the principle of copies duplicating that include the update/search ratio, machine load and reliability. Then discussed the process of copies building and update mechanism. The copies building can learn from an old one and the copies update base on message. In addition, according to the multi-copies characteristic of DPB+-Tree, we adopted dynamic fuzzy scheduling mechanism to improve load balance and response characteristic.Then studied index data migrating and restructuring. According to thecharacteristic of DPB+-Tree, we presented a kind of restructuring strategy with little cost in which only two layers of DPB+-Tree are changed. We also presented a kind of data migrating strategy using threshold. According to set two threshold, we can judge whether the load coefficient exceed limit and whether there is another node can receive data. If this is true, then one data migration process is triggered and the process is executed by node migration. Finally, to evaluate the performance of DPB+-Tree, we have conducted extensive simulation studies for response time, throughput, resource utilization and load balance. The simulation results demonstrate: DPB+-Tree not only highly improves search efficiency, but also resolves the update problem of distributed and parallel index system. It is better than CPB strategy.
Keywords/Search Tags:Distributed and Parallel Index, DPB~+-Tree, Index Data Assignment, Index Replication Strategy, Index Data Migrating and Restructuring
PDF Full Text Request
Related items