Research And Implementation Of Code Similarity Detection Algorithm

Posted on:2017-09-24

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Feng

GTID:2348330536453392

Subject:Engineering

Abstract/Summary:

Code similarity detection, which uses a similarity detection algorithm to calculate the similarity between pieces of code, is an important means of establishing software copyright and identifying code plagiarism. Compared with traditional manual inspection, it can not only calculate code similarity and locate plagiarized passages quickly and accurately, but also resist more sophisticated plagiarism techniques such as renaming variables or reordering statements. This thesis surveys the standards and techniques of code similarity detection, then applies a string matching algorithm and a fingerprint generation algorithm to the problem. The main contributions are as follows:

1. It proposes a code similarity detection algorithm based on the Smith-Waterman algorithm. The algorithm adapts Smith-Waterman to code similarity detection by generating tokens from the code, splitting the tokens by function, and defining scoring standards.
2. It proposes a code similarity detection algorithm based on the Winnowing algorithm. Unlike text fingerprinting, the algorithm generates fingerprints from tokens rather than from raw text, then calculates code similarity by comparing fingerprints.
3. It proposes parallel schemes for both algorithms based on the shared memory model. The Smith-Waterman-based algorithm is implemented in a data-parallel form, and the Winnowing-based algorithm in a task-parallel form.
4. It tests the two algorithms and compares them with JPlag on three experimental data sets. The results show that both algorithms can detect a variety of code plagiarism and outperform JPlag when statements or functions are reordered.
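
The two token-based approaches named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the regular-expression tokenizer, the keyword subset, the scoring values (`match`, `mismatch`, `gap`), the k-gram size `k`, the window size `w`, the MD5-based hash, and both similarity normalizations are assumptions chosen for illustration. Splitting tokens by function and the parallel schemes are not shown.

```python
import hashlib
import re

# Hypothetical tokenizer: identifiers, integer literals, or any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
KEYWORDS = {"int", "return", "if", "else", "for", "while"}  # illustrative subset


def tokenize(code):
    """Map source text to coarse tokens so that renaming identifiers has no effect."""
    tokens = []
    for lexeme in TOKEN_RE.findall(code):
        if lexeme in KEYWORDS:
            tokens.append(lexeme)
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")
        elif lexeme[0].isdigit():
            tokens.append("NUM")
        else:
            tokens.append(lexeme)
    return tokens


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of two token lists (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            score = max(
                0,  # a local alignment may restart anywhere
                prev[j - 1] + (match if x == y else mismatch),
                prev[j] + gap,
                curr[j - 1] + gap,
            )
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def sw_similarity(a, b, match=2):
    """Assumed normalization: score divided by the best score the shorter list allows."""
    shorter = min(len(a), len(b))
    return smith_waterman(a, b, match=match) / (match * shorter) if shorter else 0.0


def kgram_hashes(tokens, k):
    """Hash every run of k consecutive tokens to a 32-bit integer."""
    hashes = []
    for i in range(len(tokens) - k + 1):
        gram = " ".join(tokens[i:i + k]).encode("utf-8")
        hashes.append(int.from_bytes(hashlib.md5(gram).digest()[:4], "big"))
    return hashes


def winnow(tokens, k=5, w=4):
    """Fingerprint set: the minimum hash in each window of w k-gram hashes.

    Positions are dropped here, so this sketch cannot locate matched regions.
    """
    hashes = kgram_hashes(tokens, k)
    if not hashes:
        return set()
    n_windows = max(1, len(hashes) - w + 1)
    return {min(hashes[i:i + w]) for i in range(n_windows)}


def jaccard(fa, fb):
    """Assumed fingerprint comparison: shared fingerprints over all fingerprints."""
    union = fa | fb
    return len(fa & fb) / len(union) if union else 0.0
```

Because identifiers collapse to a single `ID` token, two functions that differ only in variable and function names produce identical token streams, so both `sw_similarity` and `jaccard(winnow(...), winnow(...))` report full similarity for them in this sketch.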

Related items

1	GPU data-parallel computing of sequence alignment using CUDA
2	The Implementation And Analysis Of Smith&Waterman Algorithm On Systolic Array
3	Polymorphic Worm Features Automatic Extraction Of The Model And Algorithm
4	Automatic Traffic Signature Extraction Based On Smith-Waterman Algorithm For Traffic Classification
5	Reconfigurable Computing Research Oriented Bioinformatics
6	Code Clone Detection Based On Sequence Alignment And Byte Code
7	Code Clone Detection Based On Sequence Alignment And Deep Learning
8	Design And Implementation Of Code Clone Analysis System Based On Sequence Matching
9	A Similarity Evaluation Algorithm Of C Source Program Based On Code Fingerprint
10	Application Of GPU Parallel Techniques In Improved Genetic Algorithm And Molecular Similarity