Research And Realization Of Metadata Prefetching Based On Data Mining Technology

Posted on:2009-11-18

Degree:Master

Type:Thesis

Country:China

Candidate:X Z Tang

Full Text:PDF

GTID:2178360272475120

Subject:Computer software and theory

Abstract/Summary:

In a large-scale file storage system, optimizing metadata access has a significant impact on overall file system performance. To optimize metadata access, it is particularly important to build an effective and accurate metadata prefetching model. However, existing cache and prefetching algorithms are mostly designed for file data access and do not take into account the characteristics of metadata access or the small size of metadata. Applying file data cache and prefetching algorithms directly to metadata is therefore a poor fit and may lead to inefficient access.

To address this, this thesis presents a model that uses the metadata access sequences recorded in log files to prefetch groups of metadata for users' future operations, and designs a new cache and prefetching algorithm that reflects the characteristics of metadata. Analysis of the n-gram prediction model highlights the importance of long-distance information among metadata accesses. To prefetch groups of metadata, data mining technology is introduced to support the n-gram model. After evaluating different values of the parameter n, the thesis selects the 3-gram prediction model. Integrating the 3-gram prediction model with data mining improves the cache hit rate for metadata and reduces the average response time of access requests, thereby improving the efficiency of metadata access. Trace-driven simulations show that, for the metadata access sequences of different users, the hit rate of the new metadata prefetching scheme is 3.9% and 16% higher on average than that of NEXUS and LRU, respectively.

However, the space complexity of the new metadata prefetching algorithm is relatively high, which makes it poorly suited to a real file system. To make it applicable, the thesis implements an online metadata prefetching algorithm. It mines closed frequent itemsets incrementally and discards intermediate mining information, which greatly reduces the space complexity of mining. The memory occupied by the online algorithm is significantly lower than that of the new metadata prefetching algorithm, only 24% of the latter. The online algorithm also takes comparatively less time. It therefore speeds up prefetching effectively and remedies the excessive memory use of the new metadata prefetching algorithm, making it more feasible to apply in a real file system.
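
To make the 3-gram idea in the abstract concrete, the following is a minimal Python sketch of a trigram predictor feeding an LRU cache in a trace-driven simulation. It is an illustration under stated assumptions, not the thesis's algorithm: it omits the frequent-itemset mining and group prefetching entirely, and the names `TrigramPrefetcher` and `hit_rate`, the toy trace, and the cache capacity are all hypothetical.

```python
from collections import Counter, OrderedDict


class TrigramPrefetcher:
    """Hypothetical predictor: guesses the next item from the last two accesses."""

    def __init__(self):
        # Maps (previous, current) -> Counter of items observed right after that pair.
        self.successors = {}

    def train(self, trace):
        # Count every consecutive triple (a, b, c) in the access trace.
        for a, b, c in zip(trace, trace[1:], trace[2:]):
            self.successors.setdefault((a, b), Counter())[c] += 1

    def predict(self, prev, curr, k=1):
        # Return up to k most frequent successors of the pair (prev, curr).
        counts = self.successors.get((prev, curr), Counter())
        return [item for item, _ in counts.most_common(k)]


def hit_rate(trace, capacity, prefetcher=None):
    """Replay a trace against an LRU cache, optionally prefetching predictions."""
    cache = OrderedDict()
    hits = 0

    def insert(item):
        # Insert or refresh an item, then evict least recently used entries.
        cache[item] = True
        cache.move_to_end(item)
        while len(cache) > capacity:
            cache.popitem(last=False)

    for i, item in enumerate(trace):
        if item in cache:
            hits += 1
        insert(item)
        # Assumption: prefetched items are loaded into the same cache as demand items.
        if prefetcher is not None and i >= 1:
            for predicted in prefetcher.predict(trace[i - 1], item):
                insert(predicted)
    return hits / len(trace)


# Toy cyclic trace of metadata identifiers (made up for illustration).
trace = ["a", "b", "c", "d"] * 5
prefetcher = TrigramPrefetcher()
prefetcher.train(trace)

lru_only = hit_rate(trace, capacity=2)
with_prefetch = hit_rate(trace, capacity=2, prefetcher=prefetcher)
print(f"LRU only: {lru_only:.2f}, LRU + 3-gram prefetch: {with_prefetch:.2f}")
```

The sketch trains and evaluates on the same toy trace purely to keep it short; a realistic evaluation would train on earlier log entries and replay later ones.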

Keywords/Search Tags:

metadata, n-gram model, data mining, prefetch groups

Related items

1	Research On The Application Of Data Mining Technology In Disease Related Groups
2	Research On Core Data Model Of Exploration And Development Based-on Metadata Method
3	The Management Of Metadata In OLAM (On Line Analytical Mining Processing)
4	Research, Implementation And Application Of Data Preprocessing Algorithms In Web Log Mining
5	Study And Implementation On Prefetch Technique At Proxy Based On Web Mining
6	Design And Implementation Of Data Prefetcher
7	Chinese Traditional Medicine Couples And Groups Mining Methods Research
8	Parallel Computation For File Metadata Cube In Cloud Computing Environments
9	Research On Data Model & Data Mining Of Optical Database
10	Ocean Core Data Mining Research, And Standards