
A Diversified Characteristics Extraction Approach For Similar Code Analysis

Posted on: 2020-06-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
GTID: 2428330572973615
Subject: Computer technology
Abstract/Summary:
With the rapid development of computers and the Internet, software systems are now applied to all aspects of daily life. Owing to the formal-language nature of program code, plagiarism of code has become increasingly prevalent. Given large code bases and many iterations, manual detection is almost impossible, so code similarity detection techniques have emerged. These techniques analyze program characteristics to determine the similarity between programs. Program characteristics are the basic units that represent a program's content and structure, so the accuracy of the characteristics directly affects the accuracy of the similarity detection results. This paper proposes a diversified characteristics extraction approach for similar code analysis. The approach considers several factors, including the statistical properties, structure, execution paths, and data flow of the program, and extracts characteristics from three perspectives: attribute counting, structure, and function. It also builds a characteristic database of open source program code. First, the paper introduces the concepts of the attribute counting characteristic, the structure characteristic, and the function characteristic. It then analyzes the module design of each characteristic in detail and derives the corresponding extraction guidelines. Finally, based on the Defect Testing System (DTS), it proposes an extraction algorithm for each characteristic. To improve the efficiency of code similarity detection and to store these characteristics uniformly, the paper also proposes the concept of a characteristic code together with its generation algorithm. The diversified characteristics extraction approach is implemented and integrated into DTS. In experiments on five open source programs, the characteristics extracted by this approach reach an accuracy rate of 84%, which meets the expected research objectives. In addition, the approach provides more accurate technical support for code similarity detection.
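
To make the idea of an attribute counting characteristic and a characteristic code concrete, the following is a minimal Python sketch. It is not the algorithm from the thesis, which is not given in this abstract. The keyword set, the tokenizer, the `characteristic_code` digest, and the cosine comparison are all assumptions chosen for illustration.

```python
import hashlib
import math
import re
from collections import Counter

# Hypothetical keyword set for a C-like language (an assumption, not from the thesis).
KEYWORDS = {"if", "else", "for", "while", "return", "int", "void"}

# Simple tokenizer: identifiers, integer literals, two-character operators,
# then single-character operators and punctuation.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|&&|\|\||[-+*/%=<>!(){}\[\];,]")


def attribute_counts(source: str) -> Counter:
    """Count simple statistical attributes of a code fragment."""
    counts = Counter()
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS:
            counts["kw:" + tok] += 1
        elif tok[0].isalpha() or tok[0] == "_":
            counts["identifier"] += 1  # identifier names are deliberately ignored
        elif tok[0].isdigit():
            counts["literal"] += 1
        else:
            counts["op:" + tok] += 1
    return counts


def characteristic_code(counts: Counter) -> str:
    """Illustrative fixed-length digest of the counts (assumed scheme)."""
    canonical = ";".join(f"{k}={v}" for k, v in sorted(counts.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two attribute count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Under these assumptions, two fragments that differ only in identifier names produce identical counts and therefore the same digest, so equal digests can serve as a cheap pre-filter before a finer comparison. Counts alone ignore structure and data flow, which is presumably why the approach combines them with structure and function characteristics.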
Keywords/Search Tags: Similar Code Analysis, Characteristics Extraction, Function, Symbolic Model