
Research On Code Similarity Detection Technology

Posted on: 2022-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Xie
Full Text: PDF
GTID: 2518306338986709
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the rapid development of computer technology, opportunities for plagiarizing content and software have also increased. Such unauthorized reuse makes it difficult for academia to maintain high standards of research, exposes companies to legal risk from unauthorized content in their products, and undermines intellectual property rights. Code similarity detection is a widely used technology. In education, it can be used to detect plagiarism and duplication in student code assignments. In industry, it can be used for intellectual property protection and information retrieval. Many code similarity detection methods have been proposed in recent years; they have achieved certain results and can detect different types of plagiarism. However, these methods have shortcomings, such as making limited use of code syntax and semantic information or having low detection efficiency. Detecting similar code accurately and efficiently therefore remains a challenging problem. Based on an analysis and survey of the latest research on code similarity detection at home and abroad, this thesis focuses on code similarity detection technology. The main research work is as follows:
1) To address problems in existing research on code measurement representation methods, an improved and efficient detection method is proposed. The method combines two different structural representation methods and improves both detection accuracy and algorithm execution time;
2) To address shortcomings of existing unsupervised clustering methods, an improved approach to code similarity detection is proposed. Building on previous research work, it introduces several unsupervised clustering algorithms with different characteristics and improves clustering quality and execution time compared with previous studies;
3) An improved similarity detection method based on a graph neural network model is proposed, which combines techniques such as tree edit distance and bug signatures in binary code. The method's metrics on multiple detection tasks improve on those of the models in previous research;...
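As a rough illustration of what code similarity detection involves, the sketch below compares two Python snippets by the overlap of their normalized token sequences. It is a generic token-based baseline, not any of the methods proposed in the thesis. The function names (`normalized_tokens`, `shingles`, `jaccard_similarity`), the placeholder strings, and the shingle size `k=3` are all assumptions made for this example.

```python
import io
import keyword
import tokenize

# Token types that carry no structural information for this comparison.
_SKIPPED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def normalized_tokens(source):
    """Tokenize Python source, replacing identifiers and literals with placeholders."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")    # renaming variables should not change the result
        elif tok.type == tokenize.NUMBER:
            tokens.append("NUM")
        elif tok.type == tokenize.STRING:
            tokens.append("STR")
        else:
            tokens.append(tok.string)  # keywords, operators, punctuation
    return tokens


def shingles(tokens, k=3):
    """Return the set of all k-token windows in the token list."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard_similarity(source_a, source_b, k=3):
    """Jaccard similarity of the k-token shingle sets of two code snippets."""
    set_a = shingles(normalized_tokens(source_a), k)
    set_b = shingles(normalized_tokens(source_b), k)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


original = "def add(x, y):\n    return x + y\n"
renamed = "def total(first, second):\n    return first + second\n"
unrelated = "for i in range(10):\n    print(i)\n"

print(jaccard_similarity(original, renamed))
print(jaccard_similarity(original, unrelated))
```

Because identifiers are collapsed to a single placeholder, simple renaming does not lower the score. A baseline like this ignores syntax trees and semantics, which is the kind of limitation the abstract attributes to existing methods.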
Keywords/Search Tags:code similarity detection, code measurement method, unsupervised clustering method, graph neural network model