
Automated Vulnerability Detection Of Ethereum Smart Contracts

Posted on: 2021-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: J J Song
Full Text: PDF
GTID: 2428330614471736
Subject: Cyberspace security
Abstract:
In recent years, blockchain, as an emerging technology, has attracted extensive attention from researchers in China and abroad. By integrating encryption algorithms, distributed data storage, consensus mechanisms, and other technologies, blockchain enables long-distance peer-to-peer value transfer without a trusted third party. Smart contracts have greatly extended the business scenarios to which blockchain technology applies, expanding them from the initial financial domain to real-world domains such as the Internet of Things, healthcare, energy, and smart manufacturing. Because smart contract languages are not yet mature and developers vary in skill, vulnerabilities in smart contracts are inevitable. A recent surge of smart contract security incidents has caused huge financial losses and severely reduced trust in blockchain smart contracts.

At present, the main methods of vulnerability detection in smart contracts are formal verification, symbolic execution or symbolic analysis, and fuzz testing. Each has drawbacks. Formal verification cannot be fully automated. Symbolic execution or symbolic analysis often needs to explore all executable paths in a contract or symbolically analyze its dependency graph, so the time overhead is large and the execution efficiency is low, which makes it unsuitable for high-volume vulnerability detection. The test samples generated by fuzz testing are highly random, which results in low code coverage, so fuzz testing often fails to detect all vulnerabilities in smart contract code and also suffers from a long detection cycle. Existing methods cannot keep up with the increasing number of smart contracts.

To ensure the accuracy of vulnerability detection in smart contracts while improving its efficiency, this thesis proposes an automated vulnerability detection method based on machine learning algorithms. By extracting effective features of smart contracts and training multi-label classifiers, the method accurately and efficiently detects six kinds of vulnerabilities: integer overflow, integer underflow, transaction-ordering dependence, callstack depth attack, timestamp dependency, and reentrancy. Extensive experiments show that the method is better suited to large-scale vulnerability detection in smart contracts.

To obtain enough smart contract data for this research, this thesis designs and implements a web crawler that automatically collects smart contract data in large batches; multithreading greatly increases the speed of data collection. After a period of crawling, a large amount of Ethereum smart contract data was collected, including contract source code, Solidity versions, token names, contract addresses, and other useful information, yielding a comprehensive and up-to-date smart contract data set.

By studying the relationship among contract source code, bytecode, and Ethereum Virtual Machine operation codes (opcodes), and by analyzing in depth the relationship between opcodes and smart contract vulnerabilities, this thesis proposes abstraction rules for the opcodes. It also proposes extracting bigram features from the opcode stream with the n-gram algorithm, computing the frequency value of each bigram feature with a defined feature calculation formula, and constructing a feature matrix. Extensive experiments were run on this feature matrix using the XGBoost, AdaBoost, Random Forest, SVM, and kNN classification algorithms, combined with the SMOTE or SMOTETomek data balancing methods. The results show that the XGBoost multi-label classification model trained on the SMOTETomek-balanced training set detects smart contract vulnerabilities best, with micro-F1 and macro-F1 scores of 98.48% and 96.41%, respectively. The method also greatly improves detection speed: the average detection time is about 4 seconds per contract.

Based on these experimental results, this thesis implements a B/S (browser/server) architecture vulnerability detection system for smart contracts. Users upload the source code or bytecode of a smart contract; the system processes the data online, detects vulnerabilities with the trained machine learning model, and displays the detection results.
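
As an illustration of the opcode bigram feature extraction described in the abstract, the following is a minimal Python sketch and not the thesis's actual implementation. The abstract does not give the abstraction rules or the feature calculation formula, so two assumptions are made here: the hypothetical `abstract_opcode` helper collapses numbered opcode families (for example `PUSH1` through `PUSH32`) into a single token, and the feature value is taken to be the relative frequency of each bigram within a contract. The function names are invented for this example.

```python
from collections import Counter
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]

# ASSUMPTION: the thesis's abstraction rules are not stated in the abstract.
# This hypothetical rule merges numbered opcode families into one token.
OPCODE_FAMILIES = ("PUSH", "DUP", "SWAP", "LOG")


def abstract_opcode(op: str) -> str:
    """Map e.g. PUSH1..PUSH32 to PUSH; leave other opcodes unchanged."""
    for prefix in OPCODE_FAMILIES:
        if op.startswith(prefix) and op[len(prefix):].isdigit():
            return prefix
    return op


def bigram_frequencies(opcodes: List[str]) -> Dict[Bigram, float]:
    """Relative frequency of each adjacent opcode pair in one contract.

    ASSUMPTION: relative frequency stands in for the thesis's unspecified
    feature calculation formula.
    """
    abstracted = [abstract_opcode(op) for op in opcodes]
    bigrams = list(zip(abstracted, abstracted[1:]))
    if not bigrams:
        return {}
    counts = Counter(bigrams)
    total = len(bigrams)
    return {bigram: count / total for bigram, count in counts.items()}


def feature_matrix(
    contracts: List[List[str]],
) -> Tuple[List[Bigram], List[List[float]]]:
    """Build a contracts-by-bigrams matrix over a shared, sorted vocabulary."""
    per_contract = [bigram_frequencies(ops) for ops in contracts]
    vocab = sorted({bigram for freqs in per_contract for bigram in freqs})
    matrix = [[freqs.get(bigram, 0.0) for bigram in vocab] for freqs in per_contract]
    return vocab, matrix
```

Each row of the resulting matrix corresponds to one contract and each column to one bigram in the shared vocabulary, which is the shape a multi-label classifier would expect as input.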
Keywords: Smart Contracts, Operation Code Characteristics, Vulnerability Detection, Machine Learning, Blockchain