Research And Realization Of Metadata Prefetching Based On Data Mining Technology

Posted on: 2009-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Tang
GTID: 2178360272475120
Subject: Computer software and theory
Abstract/Summary:
In large-scale file storage systems, optimizing metadata access has a significant impact on overall file system performance, and an effective and accurate metadata prefetching model is central to that optimization. However, existing cache and prefetching algorithms are mostly designed for file data; they do not take into account the access characteristics of metadata or its small size. Applying file-data caching and prefetching algorithms to metadata is therefore a poor fit and can lead to inefficient access. To address this, this thesis presents a model that uses the metadata access sequences recorded in log files to prefetch groups of metadata for users' future operations, and designs a new cache and prefetching algorithm that reflects the characteristics of metadata. Analysis of the n-gram prediction model shows the importance of long-distance information among metadata accesses. To prefetch metadata in groups, data mining technology is introduced to support the n-gram model. After evaluating different values of the parameter n, the thesis selects a 3-gram prediction model. Combining the 3-gram prediction model with data mining improves the cache hit rate for metadata and reduces the average response time of access requests, which improves the efficiency of metadata access. Trace-driven simulations show that, across the metadata access sequences of different users, the new metadata prefetching scheme raises the hit rate by an average of 3.9% over NEXUS and 16% over LRU.

However, the space complexity of the new metadata prefetching scheme is relatively high, which makes it hard to deploy in a real file system. To make it applicable, the thesis implements an online metadata prefetching algorithm. It mines closed frequent itemsets incrementally and discards intermediate mining information, which greatly reduces the space complexity of mining. The online algorithm occupies significantly less memory than the new metadata prefetching algorithm, only 24% as much, and its running time is also comparatively shorter. The online algorithm therefore speeds up prefetching and overcomes the excessive memory use of the new metadata prefetching algorithm, which makes it feasible to apply this metadata prefetching approach in a real file system.
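The idea of driving prefetch decisions from a 3-gram model over a metadata access sequence can be illustrated with a small trace-driven simulation. The sketch below is not the thesis's algorithm: it leaves out the data mining step and group prefetching, does not implement NEXUS, and every name in it (`NGramPredictor`, `LRUCache`, `simulate`) is hypothetical. It only assumes that the next metadata item is predicted from the previous n-1 accesses and that predicted items are inserted into an LRU cache ahead of time.

```python
from collections import Counter, OrderedDict, defaultdict


class NGramPredictor:
    """Counts which item followed each context of the last n-1 items."""

    def __init__(self, n=3):
        if n < 2:
            raise ValueError("n must be at least 2")
        self.n = n
        self.counts = defaultdict(Counter)  # context tuple -> Counter of successors
        self.history = []                   # last n-1 observed items

    def observe(self, item):
        ctx = tuple(self.history)
        if len(ctx) == self.n - 1:
            self.counts[ctx][item] += 1
        # Keep only the most recent n-1 items as the next context.
        self.history = (self.history + [item])[-(self.n - 1):]

    def predict(self, k=1):
        """Return up to k most frequent successors of the current context."""
        successors = self.counts.get(tuple(self.history))
        if not successors:
            return []
        return [item for item, _ in successors.most_common(k)]


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def insert(self, key):
        self.items[key] = True
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)

    def access(self, key):
        """Return True on a hit; the key is cached either way."""
        hit = key in self.items
        self.insert(key)
        return hit


def simulate(trace, capacity, n=3, prefetch=True, k=1):
    """Replay a trace and return the cache hit rate."""
    cache = LRUCache(capacity)
    predictor = NGramPredictor(n)
    hits = 0
    for item in trace:
        hits += cache.access(item)
        predictor.observe(item)
        if prefetch:
            # Prefetch the items the model expects to be accessed next.
            for candidate in predictor.predict(k):
                cache.insert(candidate)
    return hits / len(trace)


if __name__ == "__main__":
    # A made-up cyclic trace that is larger than the cache.
    trace = ["inode_a", "inode_b", "inode_c", "inode_d"] * 50
    print("LRU only:       ", simulate(trace, capacity=2, prefetch=False))
    print("3-gram prefetch:", simulate(trace, capacity=2, prefetch=True))
```

On a cyclic trace larger than the cache, plain LRU evicts every item just before it is needed again, while the 3-gram model learns the cycle and brings the next item in ahead of its access. Real metadata traces are far less regular, so this toy comparison says nothing about the hit-rate figures reported in the abstract.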
Keywords/Search Tags: metadata, n-gram model, data mining, prefetch groups