
Study On Data Management Technology Of Peer-to-Peer Distributed File System

Posted on: 2021-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: X Xiong
Full Text: PDF
GTID: 2518306107993589
Subject: Engineering

Abstract/Summary:
With the rapid development of the Internet, the demand for data exchange among users of online communities has increased dramatically, and the ways of sharing data have gradually diversified. A centralized file download server based on FTP or HTTP provides high-density information to users, but an independent server has high operating costs, and the unilateral decisions of a community operator expose the community sharing network to unpredictable risks. Commercial cloud storage offers another option for data sharing with its cross-platform, high-quality file management, but resources shared through cloud storage are more scattered, which lowers information density and raises the cost to users of screening content. A P2P sharing network is an open storage solution. Its decentralized storage architecture avoids the single point of failure, but the decentralized design itself reduces information density, and the redundancy of multiple replicas wastes storage space.

To meet the need of P2P data sharing for low-overhead data storage, this thesis proposes a low-overhead P2P data management technique that solves the data placement conflict problem of erasure codes in a decentralized architecture. First, a hierarchical P2P network is designed based on the Kademlia DHT. On this basis, isolated Kademlia spaces are generated by introducing logical storage domains, which provides the underlying support for resolving block placement conflicts at the level of the decentralized storage architecture. Second, based on erasure codes, a low-overhead data management technique is designed around a consistent block placement algorithm. Mapping conflicts between chunks are resolved by linear probing (linear detection), so that every node reaches the same placement decision independently. Finally, because node addressing and network communication delay in a decentralized architecture amplify data inconsistency, a loosely coupled synchronization and consistent data repair technique is proposed, which enables P2P nodes to organize the data repair process spontaneously and in an orderly manner.

Experiments are conducted on a P2P network composed of 24 Raspberry Pi nodes. At the same fault tolerance as 10-replica IPFS, the proposed method with 4+4 encoding and 2 copies reduces storage overhead by 59.95% and raises storage utilization by a factor of 2.5. In terms of the transmission overhead of data repair, the method reduces the overhead by 60.34% when one node fails. When the number of failed nodes reaches 9, the repair cost is only 51.39% of that of IPFS. In terms of the chunk access miss rate, the proposed CPLD algorithm has a miss rate of only 7.61% with a 6-bit storage domain and 12 encoded chunks, and the miss rate decreases further as the number of storage domains increases.
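The storage figures quoted in the abstract can be sanity-checked with idealized arithmetic. The sketch below is not taken from the thesis: the function names are hypothetical, and it assumes that every replica and every coded block sits on a distinct node and that metadata costs nothing. Under those assumptions, 10-way replication stores 10 bytes per byte of user data, while a 4+4 erasure code kept in 2 copies stores (4+4)/4 × 2 = 4 bytes per byte.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Bytes stored per byte of user data under plain replication."""
    return Fraction(copies)


def erasure_overhead(k: int, m: int, copies: int) -> Fraction:
    """Bytes stored per byte of user data for a k+m erasure code
    where every coded chunk is itself kept in `copies` copies."""
    return Fraction(k + m, k) * copies


def tolerated_failures_replication(copies: int) -> int:
    """Node failures survivable when each copy is on a distinct node."""
    return copies - 1


def tolerated_failures_erasure(m: int, copies: int) -> int:
    """Data is lost only once m + 1 chunks are gone, and each chunk
    needs all of its copies to fail (assumes distinct nodes per block)."""
    return (m + 1) * copies - 1


ipfs = replication_overhead(10)          # 10x, the 10-replica baseline
proposed = erasure_overhead(4, 4, 2)     # 4x, 4+4 encoding with 2 copies

reduction = 1 - proposed / ipfs                  # fraction of storage saved
utilization_gain = (1 / proposed) / (1 / ipfs)   # ratio of useful-data shares

print(f"overhead: {ipfs}x vs {proposed}x")
print(f"storage reduction: {float(reduction):.2%}")
print(f"utilization gain: {float(utilization_gain)}x")
print("tolerated failures:",
      tolerated_failures_replication(10),
      tolerated_failures_erasure(4, 2))
```

The idealized reduction of 60% and the 2.5-fold utilization gain are close to the measured 59.95% and 2.5 times reported above, and both configurations survive 9 node failures in this model, which is consistent with the "same fault tolerance" comparison.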
Keywords/Search Tags: Distributed Storage, Peer-to-Peer, Distributed Hash Table, Metadata Management, File Sharing