With the rapid growth of the Internet, new kinds of applications have emerged in quick succession. Web 2.0 in particular, which is representative of the new generation of Internet applications, poses great challenges to traditional data storage. In the past, Internet services were mostly based on the client/server model, with resources stored on centralized servers. As the volume of stored content grows sharply, servers alone can no longer meet the access requirements of a massive number of Internet users.

Incorporating the currently popular Peer-to-Peer (P2P) technique, we have developed a massive data storage service that is built on dedicated storage servers and oriented toward Internet users. A small number of storage servers deployed across the WAN form a reliable structured overlay network that provides Internet users with a guaranteed resource location service. The storage servers are responsible for storing and distributing piece replicas, and the redundancy provided by several replicas improves the system's availability and avoids a single point of failure. A self-adaptive replica redundancy management algorithm adjusts the number of replicas intelligently and migrates replicas to storage servers near the user, which facilitates future mobile services. Users downloading the same content form a P2P overlay network in which they contribute upload bandwidth to one another, relieving the load on the servers. The system therefore scales well and can still cope when a massive number of users request the same content.

To give the storage server high performance and high availability, network I/O is handled asynchronously with non-blocking sockets in a Reactor framework built on the Linux epoll mechanism. Because the server uses many timers, the Reactor also implements efficient timer management. Disk I/O is processed with Linux AIO, and an efficient disk block cache management algorithm has been proposed.
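The abstract does not say which data structure backs the Reactor's timer management. As a minimal sketch, assuming a binary min-heap keyed on expiry time, the loop could ask the heap how long `epoll_wait` may block and then run whatever has expired. The class name `TimerQueue` and its methods are hypothetical and are not taken from ChunkServer.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical timer queue for a reactor loop (illustrative names only).
// Assumption: a min-heap ordered by absolute expiry time in milliseconds.
class TimerQueue {
public:
    using Callback = std::function<void()>;

    // Schedule cb to fire once at absolute time expires_ms.
    void add(std::int64_t expires_ms, Callback cb) {
        assert(cb && "timer callback must not be empty");
        heap_.push(Entry{expires_ms, next_seq_++, std::move(cb)});
    }

    // Value to pass as the epoll_wait() timeout: -1 blocks indefinitely when
    // no timer is pending, 0 means a timer is already due.
    int next_timeout_ms(std::int64_t now_ms) const {
        if (heap_.empty()) return -1;
        std::int64_t delta = heap_.top().expires_ms - now_ms;
        if (delta <= 0) return 0;
        return delta > INT_MAX ? INT_MAX : static_cast<int>(delta);
    }

    // Run every callback whose deadline has passed; returns how many fired.
    int run_expired(std::int64_t now_ms) {
        int fired = 0;
        while (!heap_.empty() && heap_.top().expires_ms <= now_ms) {
            Callback cb = heap_.top().cb;  // copy out before popping
            heap_.pop();
            cb();
            ++fired;
        }
        return fired;
    }

private:
    struct Entry {
        std::int64_t expires_ms;
        std::uint64_t seq;  // tie-breaker: earlier insertions fire first
        Callback cb;
    };
    struct Later {  // orders the heap so the earliest deadline is on top
        bool operator()(const Entry& a, const Entry& b) const {
            if (a.expires_ms != b.expires_ms) return a.expires_ms > b.expires_ms;
            return a.seq > b.seq;
        }
    };
    std::priority_queue<Entry, std::vector<Entry>, Later> heap_;
    std::uint64_t next_seq_ = 0;
};
```

In an event loop of this shape, each iteration would call `next_timeout_ms`, block in `epoll_wait` for at most that long, dispatch ready sockets, and then call `run_expired`. Insertion and removal cost O(log n), which is one common way to keep many timers cheap.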
The Chord location algorithm has been implemented with UDP socket programming.

The system consists of a server (ChunkServer) and a client (PPDSS Client). ChunkServers run on the Linux platform, are written in C++, and follow an object-oriented design. Provided that the exceptional conditions of real networks are taken into account, the system can be deployed on the Internet as a fundamental storage platform that provides reliable storage service to a massive number of Internet users.

To relieve the load on the storage servers and improve the system's throughput, a direct neighboring node selection algorithm based on piece amplification has been designed and implemented. The algorithm selects peers with good piece amplification and guides them to connect to the storage servers directly, download pieces from them, and quickly redistribute those pieces. This improves the download experience remarkably, especially just after content is published. According to our modeling analysis and experimental verification, the algorithm distributes content more effectively and increases the system's throughput.
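To make the Chord-based resource location mentioned above more concrete, the following sketch routes a key lookup through finger tables on a small in-memory ring. It is the textbook algorithm under simplifying assumptions, not the ChunkServer implementation: there is no UDP transport, no node join or failure handling, and the 6-bit identifier space, the `ChordRing` type, and the node IDs are illustrative choices.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Illustrative Chord ring on a 2^M identifier circle (hypothetical names).
constexpr unsigned M = 6;                 // identifier bits, ring size 64
constexpr std::uint32_t RING = 1u << M;

// True when x lies in the ring interval (a, b]; a == b means the whole ring.
bool in_half_open(std::uint32_t x, std::uint32_t a, std::uint32_t b) {
    if (a < b) return a < x && x <= b;
    return x > a || x <= b;               // interval wraps past zero
}

// True when x lies in the open ring interval (a, b).
bool in_open(std::uint32_t x, std::uint32_t a, std::uint32_t b) {
    if (a < b) return a < x && x < b;
    return x > a || x < b;
}

struct ChordRing {
    std::set<std::uint32_t> nodes;                               // live node IDs
    std::map<std::uint32_t, std::vector<std::uint32_t>> finger;  // per-node fingers

    // First live node at or after id, wrapping around the ring.
    std::uint32_t successor_of(std::uint32_t id) const {
        auto it = nodes.lower_bound(id % RING);
        return it != nodes.end() ? *it : *nodes.begin();
    }

    // Rebuild all finger tables: finger[n][i] = successor(n + 2^i).
    void build() {
        finger.clear();
        for (std::uint32_t n : nodes)
            for (unsigned i = 0; i < M; ++i)
                finger[n].push_back(successor_of((n + (1u << i)) % RING));
    }

    // Route a lookup for key starting at node start. Returns the node
    // responsible for key and reports how many forwarding hops were taken.
    std::uint32_t lookup(std::uint32_t start, std::uint32_t key, int& hops) const {
        std::uint32_t n = start;
        hops = 0;
        for (;;) {
            const std::vector<std::uint32_t>& f = finger.at(n);
            if (in_half_open(key, n, f[0])) return f[0];  // f[0] is n's successor
            std::uint32_t next = n;
            // Closest preceding finger: scan from the farthest finger down.
            for (int i = static_cast<int>(M) - 1; i >= 0; --i)
                if (in_open(f[i], n, key)) { next = f[i]; break; }
            if (next == n) return f[0];                   // defensive fallback
            n = next;
            ++hops;
        }
    }
};
```

With nodes {1, 8, 14, 21, 32, 38, 42, 48, 51, 56}, a lookup for key 54 that starts at node 8 is forwarded to node 42 and then to node 51, which returns node 56 as the responsible node. In a deployed system each forwarding step would be a message between storage servers rather than a local table read.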