The Design And Implementation Of Distributed File System In Crowd Intelligence Sensation Environment

Posted on: 2014-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: H J Zhan
GTID: 2268330425475869
Subject: Software engineering

Abstract/Summary:
The development of sensor technology and the popularity of smartphones have made crowd intelligence sensation a central part of mobile computing. The data it produces is large and varied, which places higher demands on the storage system. A traditional file system is limited both in the amount of data it can store and in the throughput it can offer, so it is not suitable for crowd intelligence sensation. Most existing distributed file systems use a centralized metadata management strategy, which is not suitable for the large volume of metadata produced by the huge number of small files in a crowd intelligence sensation environment. GFS and HDFS are dedicated to storing big files, while TFS and Haystack are suited to storing small files, but a distributed file system for a crowd intelligence sensation environment should store both big files and small files effectively.

The storage of huge amounts of data is the basis of crowd intelligence sensation. This thesis studies the principles of the main distributed file systems, with emphasis on how to distribute metadata over several metadata servers and on strategies for storing big files and small files. On this basis, it describes the design and implementation of the distributed file system HaipengFS, which has distributed metadata servers and supports storing big files and small files at the same time. The main research content of this thesis includes:

1. Study the main distributed file systems GFS, HDFS, TFS and Ceph: their system architectures, implementation principles and limitations.
2. To meet the requirement of storing a huge amount of file metadata, study the main strategies for distributing metadata over several metadata servers, then design a metadata distribution strategy for HaipengFS based on consistent hashing, with reference to current strategies.
3. To meet the requirement of storing big files and small files at the same time, design a storage strategy for HaipengFS with reference to current strategies for storing big files and small files: a big file uses several data blocks to store its data, while several small files share one data block and use an index file to store the offset of each small file within the data block.
4. So that the file system can adapt to a changing workload, design a load balancing algorithm based on a genetic algorithm for the metadata servers of HaipengFS, then verify the algorithm by experiment.
5. Based on the above research points, design the distributed file system HaipengFS, implement the prototype system in C++ under the Linux operating system, and test its file read and write performance on Aliyun.
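The consistent-hashing idea in point 2 can be illustrated with a minimal sketch. This is not the HaipengFS implementation (the prototype is written in C++, and the abstract gives no details of its ring layout); the class name `MetadataRing`, the server names, the use of SHA-256 and the number of virtual nodes are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring (hash choice is an assumption)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class MetadataRing:
    """Hypothetical consistent-hash ring mapping file paths to metadata servers."""

    def __init__(self, servers, vnodes=64):
        self._vnodes = vnodes   # virtual nodes per server, to even out the load
        self._points = []       # sorted list of (ring position, server name)
        for server in servers:
            self.add_server(server)

    def add_server(self, server):
        for i in range(self._vnodes):
            bisect.insort(self._points, (_hash(f"{server}#{i}"), server))

    def remove_server(self, server):
        self._points = [p for p in self._points if p[1] != server]

    def locate(self, path):
        """Return the server owning the first ring point at or after hash(path)."""
        if not self._points:
            raise LookupError("no metadata servers on the ring")
        i = bisect.bisect_right(self._points, (_hash(path), ""))
        return self._points[i % len(self._points)][1]   # wrap around the ring
```

The property that makes this attractive for distributed metadata is that removing or adding one server only remaps the paths that server owned; paths held by the other servers stay where they are.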
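The small-file strategy in point 3 (several small files sharing one data block, with an index recording each file's offset) might look roughly like the following sketch. The abstract does not specify the on-disk format, so the class `PackedBlock`, the in-memory buffer standing in for the block, the dictionary standing in for the index file, and the stored length field are all hypothetical.

```python
import io


class PackedBlock:
    """Hypothetical data block shared by several small files, plus its index."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity   # assumed block size in bytes
        self.data = io.BytesIO()   # stands in for the data block on disk
        self.index = {}            # stands in for the index file: name -> (offset, length)

    def append(self, name: str, payload: bytes) -> bool:
        """Append a small file; return False if the name exists or the block is full."""
        offset = self.data.seek(0, io.SEEK_END)   # current end of the block
        if name in self.index or offset + len(payload) > self.capacity:
            return False
        self.data.write(payload)
        self.index[name] = (offset, len(payload))
        return True

    def read(self, name: str) -> bytes:
        """Look up the offset in the index, then read the file from the block."""
        offset, length = self.index[name]
        self.data.seek(offset)
        return self.data.read(length)
```

Packing files this way means one block-level metadata entry can cover many small files, at the cost of one extra index lookup per read.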
Keywords/Search Tags: distributed file system, metadata, big file, small file, load balance