
Research On Methods And Technology For Metadata Querying In Large-Scale File Systems

Posted on: 2012-06-26 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: L K Liu | Full Text: PDF
GTID: 1118330362467996 | Subject: Computer Science and Technology
Abstract/Summary:
As modern file systems continue to grow in scale, file-system-wide metadata querying is becoming increasingly important for administrators of large-scale storage systems who need to monitor and better understand the overall make-up of their file systems. It provides information to guide space management, charge-back, provisioning decisions, and migration among multiple storage tiers. Although there has been prior work on metadata querying, many problems remain unsolved. In this thesis, we address two of the most important challenges of metadata querying for storage management: efficient metadata crawling and synchronization, and the I/O bottlenecks of long-scan based querying. We do so by employing several novel techniques that exploit properties of metadata and of queries. The main contributions of this dissertation are summarized as follows:

(1) A file system metadata study from the perspective of metadata querying. Our study analyzes two large-scale file system traces from enterprise-class file servers and reveals several observations that can be used to improve file-system-wide metadata querying, such as a highly skewed change frequency distribution, temporal and spatial locality, and the impact of file types.

(2) SmartScan, a selective-scan based metadata crawling approach that exploits patterns in metadata changes to significantly improve the efficiency of synchronization between file systems and the metadata database used for querying. Experiments using SmartScan on production file systems show that this approach can reduce the time needed to refresh the metadata database by one to two orders of magnitude compared with a full scan. Furthermore, such selective scanning has minimal impact on freshness and on the results of example metadata queries.

(3) FastDu, a mechanism for tracking file system changes by intercepting file system calls. A directory summary collecting service is also proposed to demonstrate the usage of FastDu. Compared with the existing directory summary collecting method, which is based on just-in-time file system traversal, FastDu shows a response time improvement of two to three orders of magnitude, with an almost negligible file system performance penalty visible to applications.

(4) An encoding method for native file system properties, an on-disk layout for the metadata copy that exploits the spatial locality and highly skewed distribution of metadata, and a loose parallel depth-first traversal algorithm that helps overcome the I/O bottlenecks of long-scan based metadata querying. Experimental results show that, compared with a general-purpose compression approach, this new approach provides a better compression ratio and faster scanning performance.

(5) A metadata query system based on materialized views and fast parallel metadata scanning. Compared with the RDBMS-based solution, the new system can effectively support long-scan querying and is more scalable.
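To illustrate the general idea behind contribution (3), the following is a minimal sketch of keeping per-directory summaries up to date from change events instead of traversing the tree at query time. It is not FastDu itself: the abstract does not describe FastDu's interfaces or data structures, so the class DirSummary, the event methods on_write and on_unlink, and the query method du are all hypothetical names chosen for illustration, and the events are assumed to be delivered by some interception layer that is not shown here.

```python
from collections import defaultdict
from pathlib import PurePosixPath


class DirSummary:
    """Hypothetical sketch: per-directory (file count, total bytes),
    updated incrementally from file change events."""

    def __init__(self):
        self.sizes = {}                              # file path -> last known size
        self.totals = defaultdict(lambda: [0, 0])    # directory -> [files, bytes]

    def _apply(self, path, d_files, d_bytes):
        # Propagate the delta to every ancestor directory of the file.
        for parent in PurePosixPath(path).parents:
            entry = self.totals[str(parent)]
            entry[0] += d_files
            entry[1] += d_bytes

    def on_write(self, path, new_size):
        # Assumed to be called when a file is created or its size changes.
        old = self.sizes.get(path)
        self.sizes[path] = new_size
        if old is None:
            self._apply(path, 1, new_size)
        else:
            self._apply(path, 0, new_size - old)

    def on_unlink(self, path):
        # Assumed to be called when a file is removed.
        old = self.sizes.pop(path, None)
        if old is not None:
            self._apply(path, -1, -old)

    def du(self, directory):
        # Answered from the maintained totals, with no tree traversal.
        files, size = self.totals.get(directory, (0, 0))
        return files, size
```

In this sketch a summary query is a single dictionary lookup, while each change event costs work proportional to the depth of the file's path. That trade-off is one plausible reason an event-driven design can answer summary queries much faster than just-in-time traversal, but the figures reported in the abstract come from the author's own system and experiments, not from this example.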
Keywords/Search Tags:large-scale file system, manageability, metadata, querying, performance