With the rapid development of Internet storage technology, P2P distributed storage systems, which offer high scalability and stability, have gradually become a hot research topic. Although many P2P systems exist on the market, several problems remain: (1) many storage systems use either replica redundancy or erasure code redundancy, each of which has its own advantages and disadvantages; (2) most storage systems apply the same redundancy ratio to all files, with no distinction between hot files and normal files, so a great deal of storage space is wasted; (3) the creation time of files is not taken into consideration; (4) systems with an erasure code redundancy mechanism do not consider the fact that data blocks are not evenly distributed across the storage nodes.

This paper focuses on the storage efficiency of P2P storage systems and addresses two aspects. First, a new dynamic hybrid redundancy management mechanism is presented. Second, the influence of erasure-code-based file redundancy on file retrieval time is analyzed, and a mathematical model for predicting file retrieval time is put forward.

The main work of this paper is as follows: (1) The research background and significance of storage efficiency optimization in P2P distributed storage systems, and the state of P2P storage system research at home and abroad, are introduced. (2) Replica redundancy and erasure code redundancy are studied in depth, and important aspects of the two technologies, such as redundancy and fault tolerance, are compared. (3) A new dynamic hybrid redundancy management mechanism is presented that combines replica redundancy with erasure code redundancy and is driven by the access count and creation time of files. Based on creation time and access frequency, files are divided into hot files and cold files.
A different storage mechanism is proposed for each category, and the mechanism also takes network load balancing into account. Simulation experiments are carried out to verify its advantages. Two distribution strategies for data blocks in erasure-code-based P2P storage systems, uniform and non-uniform, are then analyzed in order to calculate the shortest retrieval time of a target file. For the uniform distribution strategy, the distribution function of the estimated retrieval time of the target file is obtained. Owing to the complexity of the non-uniform strategy, only the framework of its distribution function is derived. Finally, the experiments show that it is possible to reduce data redundancy significantly without excessively compromising data availability, at the cost of slightly lengthening the object retrieval time.
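
The trade-off between redundancy and availability mentioned above can be illustrated with a minimal sketch. It is not the model developed in the paper. It assumes that every storage node is online independently with the same probability `p`, and it uses the standard availability formulas for r-way replication and for an (n, k) erasure code, in which any k of the n blocks suffice to rebuild the file. The function names and the parameter values (`p = 0.9`, 3 replicas, a (12, 8) code) are hypothetical choices made only for illustration.

```python
from math import comb


def replication_availability(p: float, r: int) -> float:
    """Probability that at least one of r replicas is reachable,
    assuming independent nodes that are each online with probability p."""
    return 1.0 - (1.0 - p) ** r


def erasure_availability(p: float, n: int, k: int) -> float:
    """Probability that at least k of n coded blocks are reachable,
    under the same independence assumption (binomial tail)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical parameters, chosen only for illustration.
p = 0.9
rep_overhead = 3            # 3 full replicas: 3x the original file size
ec_overhead = 12 / 8        # (12, 8) erasure code: 1.5x the original file size

rep_avail = replication_availability(p, 3)
ec_avail = erasure_availability(p, 12, 8)

print(f"replication: overhead {rep_overhead:.1f}x, availability {rep_avail:.4f}")
print(f"erasure code: overhead {ec_overhead:.1f}x, availability {ec_avail:.4f}")
```

Under these assumed values the erasure code stores half as much data as triple replication while its availability stays within about half a percentage point of the replicated case. Retrieval, however, requires contacting at least k nodes instead of one, which is where the longer retrieval time comes from.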