With the rapid development of cloud computing, cloud storage applications have become increasingly widespread, and more and more companies and individuals choose to store their data in cloud storage systems. As adoption has grown, data integrity verification and data deduplication for cloud storage have become a focus of attention in both the academic and business communities. Data integrity is the basic guarantee that a cloud storage service offers its users. At the same time, to save storage space and improve data access efficiency, cloud storage service providers urgently need a data deduplication solution. This paper therefore studies in depth the data integrity verification methods and data deduplication techniques of cloud storage systems.
Ensuring the integrity of user data is a basic requirement of a cloud storage system. To provide users with secure and intact data storage, the system must support data integrity verification both when data is transmitted and when it is used. Taking the distributed storage system HDFS as a research example, this paper presents an RSA-based data integrity verification method combined with state authentication techniques. The method has low time complexity, supports integrity verification of dynamically changing data, supports public verification, and prevents data from being leaked during verification. The paper then proves the robustness and feasibility of the verification method.
As the amount of data held in cloud storage grows, more storage space is needed to hold it and more time is needed to process it. Deduplicating the data in cloud storage is therefore necessary.
Deduplicating the data stored in cloud storage not only improves the utilization of storage space but also improves the efficiency of data access. Based on the characteristics of data distribution in cloud storage, and approaching the problem from the perspective of fingerprint index optimization, this paper proposes a similarity-based block-level deduplication method. The method uses a feature fingerprint of the data together with data block fingerprints to build two fingerprint indexes, which greatly improves the efficiency of fingerprint lookup. The method can deduplicate blocks across files of the same type belonging to multiple users, which greatly improves storage space utilization and data processing efficiency. Finally, experiments measuring deduplication rate, system throughput, and memory usage show that the proposed deduplication method performs well.
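As a rough illustration of the two-index idea, the following Python sketch shows one way a feature fingerprint and block fingerprints could work together. It is not the implementation evaluated in this paper. The fixed 4096-byte block size, the use of SHA-256, the choice of the smallest block fingerprint as the similarity feature, and the names `block_fingerprints`, `feature_fingerprint`, and `TwoLevelDedupStore` are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; the paper does not specify one


def block_fingerprints(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and yield (fingerprint, block) pairs."""
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        yield hashlib.sha256(block).hexdigest(), block


def feature_fingerprint(fingerprints):
    """Assumed similarity feature: the smallest block fingerprint of the file."""
    return min(fingerprints)


class TwoLevelDedupStore:
    """Hypothetical store with a feature index on top of per-bin block indexes."""

    def __init__(self):
        # First index: feature fingerprint -> bin.
        # Second index (inside each bin): block fingerprint -> block bytes.
        self.bins = {}

    def put(self, data: bytes):
        """Store data and return a recipe (feature, block fingerprints)."""
        pairs = list(block_fingerprints(data))
        if not pairs:
            return None, []
        fps = [fp for fp, _ in pairs]
        feature = feature_fingerprint(fps)
        bin_ = self.bins.setdefault(feature, {})
        for fp, block in pairs:
            # Only the bin of similar files is searched for duplicates.
            bin_.setdefault(fp, block)
        return feature, fps

    def get(self, recipe) -> bytes:
        """Rebuild the original data from a recipe returned by put()."""
        feature, fps = recipe
        if feature is None:
            return b""
        bin_ = self.bins[feature]
        return b"".join(bin_[fp] for fp in fps)

    def stored_blocks(self) -> int:
        """Number of unique blocks physically kept across all bins."""
        return sum(len(bin_) for bin_ in self.bins.values())
```

In this sketch, two users who upload the same file get identical recipes and the blocks are stored once. Files that are only similar are deduplicated against each other when they share the same feature fingerprint. Otherwise they fall into different bins and some duplicate blocks are missed, which is the price of searching a small bin rather than one global block index.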