| Data has enormous potential economic value as the "oil" of today’s era. To promote data flow and realize the full value of data, many data markets have emerged in recent years that facilitate data trading by matching demand with data sources. At present, most research on data trading relies on blockchain so that buyers and sellers can complete a trade on their own, without a trusted third party. To protect the interests of data buyers, part of the ciphertext is randomly selected before trading, and the validity of the data and of the decryption key is verified after decryption. To provide data validity verification and key validity verification without exposing the data content, this thesis proposes a fair data trading scheme based on the blockchain trading mode of on-chain trading and off-chain storage. The main work of the thesis includes: (1) A data quality assessment mechanism based on data embedding is proposed. Using the BERT pre-trained model, the data plaintext is embedded into data vectors, so that data analysis can be completed through similarity comparison while the data content stays hidden. The category matching degree and the repetition rate of the data set are used as the two indicators for data quality evaluation. First, the maximum and minimum cosine similarity between data vectors of the same category are calculated on a public standard data set and serve as the standard thresholds for evaluation. For the category matching indicator, the sample data provided for display is embedded into a data vector, the cosine similarity between the sample data vector and the data vectors for sale is calculated, and the minimum similarity threshold is used to judge whether the data set meets the category requirements. For the repetition rate indicator, the cosine similarity between the data vectors for sale is calculated in pairs, and the maximum similarity threshold is used to judge the repetition rate of the entries in the data set for sale. Finally, the results of the data quality assessment determine whether to continue the trade. (2) A blockchain-based key sharing and validity verification mechanism is proposed. To share decryption keys in the open environment of the blockchain while keeping the process traceable and non-repudiable, the improved BVES algorithm is used to publish the secret factor parameters and conversion keys on the blockchain, ensuring that only the data buyer can recover the decryption key. To verify the validity of the key without exposing the data content, the data is encrypted with AES in Cipher Block Chaining (CBC) mode. The cloud storage platform shares one random ciphertext block with the data buyer for key validation. Decrypting this block yields the XOR of the current plaintext block and the previous ciphertext block; the current plaintext cannot be recovered without the previous ciphertext block, so the interests of the data owner are not harmed. Based on this XOR value, smart contracts ensure that key verification and payment are completed at the same time, achieving fair trading. Finally, a fairness analysis and a security analysis of the scheme are carried out. Based on the security analysis results, a user request processing strategy for the cloud storage platform is further proposed, which strengthens the scheme's resistance to collusion attacks by limiting the number of concurrent transaction requests for the same data. (3) Based on the above scheme, a fair data trading system based on the Fabric blockchain is designed and implemented, and functional and performance tests of the prototype system are carried out. The test results show that the proposed scheme can complete data trading between buyers and sellers while ensuring fairness, and can resist collusion attacks to a certain extent. |
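
The two quality indicators in (1) can be illustrated with a minimal sketch. It assumes the embeddings are already available as plain lists of floats (the BERT embedding step is omitted), and the function names, toy vectors, and threshold values are hypothetical. In particular, defining the repetition rate as the fraction of pairs above the maximum threshold is one possible reading, not necessarily the definition used in the thesis.

```python
import math
from itertools import combinations

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def category_match_rate(sample_vec, sale_vecs, min_threshold):
    """Fraction of for-sale vectors at least min_threshold similar to the sample."""
    hits = sum(
        1 for v in sale_vecs
        if cosine_similarity(sample_vec, v) >= min_threshold
    )
    return hits / len(sale_vecs)

def repetition_rate(sale_vecs, max_threshold):
    """Fraction of vector pairs whose similarity exceeds max_threshold.

    Assumption: a pair above the threshold counts as a repeated entry.
    """
    pairs = list(combinations(sale_vecs, 2))
    repeated = sum(
        1 for u, v in pairs
        if cosine_similarity(u, v) > max_threshold
    )
    return repeated / len(pairs)

# Toy 2-D vectors standing in for real embeddings (hypothetical data).
sample = [1.0, 0.0]
for_sale = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [1.0, 0.0]]

match = category_match_rate(sample, for_sale, min_threshold=0.5)
repeat = repetition_rate(for_sale, max_threshold=0.99)
print(f"category match rate: {match:.2f}, repetition rate: {repeat:.2f}")
```

In the scheme described above, both thresholds would come from the public standard data set rather than being hard-coded as they are here.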
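
The CBC property that key verification in (2) relies on can be sketched as follows. To stay self-contained, the sketch replaces AES with a toy 16-byte Feistel cipher built on SHA-256. That cipher is an assumption made only for illustration and is not secure, and the key, IV, and block contents are hypothetical. The sketch shows only that decrypting a single ciphertext block yields the XOR of the current plaintext block and the previous ciphertext block; how the smart contract uses that value is not modeled.

```python
import hashlib

BLOCK = 16  # bytes; same block size as AES

def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key, i, half):
    """Feistel round function: truncated SHA-256 of key, round index, half block."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def encrypt_block(key, block):
    """Toy 4-round Feistel cipher (stand-in for AES; NOT secure)."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor_bytes(left, _round(key, i, right))
    return left + right

def decrypt_block(key, block):
    """Inverse of encrypt_block: undo the rounds in reverse order."""
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor_bytes(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key, iv, plaintext_blocks):
    """CBC mode: C_i = E_k(P_i XOR C_{i-1}), with C_{-1} = IV."""
    prev, out = iv, []
    for p in plaintext_blocks:
        prev = encrypt_block(key, xor_bytes(p, prev))
        out.append(prev)
    return out

# Hypothetical key, IV, and data blocks (each block is exactly 16 bytes).
key = b"demo-key-16bytes"
iv = bytes(BLOCK)
plain = [b"block-0-of-data!", b"block-1-of-data!", b"block-2-of-data!"]
cipher = cbc_encrypt(key, iv, plain)

i = 2  # index of the single ciphertext block shared with the buyer
xor_value = decrypt_block(key, cipher[i])

# The buyer learns P_i XOR C_{i-1}, not P_i itself.
assert xor_value == xor_bytes(plain[i], cipher[i - 1])
assert xor_value != plain[i]

# Only a party that also holds C_{i-1} can recover the plaintext block.
assert xor_bytes(xor_value, cipher[i - 1]) == plain[i]
```

With the correct key the buyer obtains a well-defined XOR value that can be checked, while the plaintext block stays hidden because the previous ciphertext block is withheld.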