| Over the past few years, driven by the prevalence of decentralized applications (e.g., cryptocurrency), blockchain technology has attracted a great deal of attention from both industry and academia. In essence, a blockchain system can be regarded as a distributed storage system. Each node stores a copy of the state data and ledger data, which enables decentralized validation. On top of this storage system, on-chain query functions have also been developed, providing new support for upper-level applications. However, the current blockchain storage mechanism scales poorly, in three main respects. First, the growth of state data makes block validation inefficient, which causes high block propagation delay and limits system throughput. Second, the growth of ledger data leads to high storage overhead, which reduces the number of full nodes in the network and weakens the system’s decentralization. Third, supporting verifiable queries in a Byzantine environment requires an enhanced storage mechanism, which may further aggravate the growth of state data and ledger data. Solving these issues mainly involves optimizing the blockchain storage mechanism, and this work addresses the following three aspects. To address inefficient block validation, we propose EBV, which reduces the state data held by a node and improves the efficiency of block validation. EBV replaces the original representation of state data in the Bitcoin system, the Unspent Transaction Output (UTXO) set, with a bit-vector set, which significantly reduces the memory required for state data and thus improves storage scalability and validation efficiency. To meet the validation requirements of transactions, EBV is designed around two aspects: on the one hand, the bit-vector set stored in a node enables unspent validation; on the other hand, the additional proof data in the transaction 
structure supports existence validation. Experimental results demonstrate that EBV significantly reduces the memory requirement of state data, to about 7.2% of that of the original Bitcoin system. When the EBV-based blockchain system and the Bitcoin system are given the same memory limit, EBV reduces the block validation delay to as little as 6.5% of that of Bitcoin. To deal with the high storage overhead of blockchain nodes, a flexible storage model, Jidar, is proposed, based on the strategy of associated storage. This model splits the block data and stores it across different nodes, effectively reducing each node’s storage overhead. Concretely, when a node receives a new block, Jidar requires it to store only the block header and the relevant transactions in the block body. For most ordinary nodes, which participate in only a small number of transactions, Jidar effectively reduces storage overhead and thus improves storage scalability. Jidar supports the legality verification of transactions in two ways. First, the transaction initiator attaches the Merkle branch to the transaction as the existence proof of the input. Second, a Bloom filter structure is added to the block header as the unspent proof of the input. Experimental results show that, compared with the original Bitcoin system, Jidar effectively reduces ledger storage overhead while introducing only a small communication overhead. For nodes that participate in only a few transactions, Jidar can decrease storage overhead to as little as 1.03% of the original. To meet the requirements of verifiable queries in the blockchain system, we propose LVQ, a lightweight verifiable query mechanism. LVQ is built on two variants of the Merkle tree, namely the Sorted Merkle Tree (SMT) and the Bloom filter integrated Merkle Tree (BMT). Both SMT and BMT can be used to generate verifiable data with low network overhead during the query process. In addition, since only the root hashes of the SMT and BMT are stored in the block header, the light 
node’s storage overhead can be kept low and the scalability of the storage mechanism can be kept high. Experimental results show that, compared with the plain verifiable query scheme, LVQ incurs only 1.4% of the network overhead while introducing negligible storage overhead. In summary, this research optimizes the blockchain system at the storage layer. It proposes an efficient data validation method, a flexible ledger storage model, and a lightweight verifiable query mechanism, which together effectively improve storage scalability. A highly scalable storage mechanism can support high blockchain performance and protect system security. This research is expected to further accelerate the adoption of blockchain technology and broaden the application scenarios of blockchain systems. |
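
The abstract relies on two recurring checks: existence validation through a Merkle branch, and unspent validation through a compact bit structure. The Python sketch below illustrates both ideas in their simplest form. It is not the EBV, Jidar, or LVQ implementation: the function names, the Bitcoin-style double SHA-256 with last-node duplication, and the one-bit-per-output layout of `BitVector` are all assumptions made for illustration, and the input list of transactions is assumed to be non-empty.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Double SHA-256 (assumed here, following Bitcoin's Merkle tree)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level):
    """Hash adjacent pairs; the caller guarantees an even-length level."""
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root hash over a non-empty list of serialized transactions."""
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_branch(leaves, index):
    """Sibling hashes from leaf `index` up to the root.

    Each entry is (sibling_hash, sibling_is_left).
    """
    level = [h(x) for x in leaves]
    branch = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1  # the other node of the same pair
        branch.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return branch


def verify_branch(leaf, branch, root):
    """Existence check: recompute the root from the leaf and its branch."""
    node = h(leaf)
    for sibling, sibling_is_left in branch:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


class BitVector:
    """Hypothetical unspent tracker: one bit per output, 1 means unspent."""

    def __init__(self, n_outputs: int):
        self.bits = (1 << n_outputs) - 1  # every output starts unspent

    def is_unspent(self, i: int) -> bool:
        return (self.bits >> i) & 1 == 1

    def spend(self, i: int) -> None:
        self.bits &= ~(1 << i)  # clear bit i


txs = [b"tx0", b"tx1", b"tx2"]
root = merkle_root(txs)
proof = merkle_branch(txs, 2)
outputs = BitVector(3)
outputs.spend(1)
```

In this sketch a verifier needs only the root hash and a branch of logarithmic length to confirm that a transaction exists, and one bit per output to confirm that it is unspent, which is the intuition behind keeping per-node state small.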