Research On The Copy Detection Technology For Source Code

Posted on: 2009-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: A P Deng
Full Text: PDF
GTID: 2178360242990809
Subject: Computer application technology
Abstract/Summary:
Programming courses are an essential part of computer education in colleges. Students need extensive programming practice to improve their programming ability, and ensuring the quality of that practice requires some means of copy detection for source code. Copy detection for program source code judges whether a given piece of code plagiarizes the contents of other programs. Plagiarism may take several forms, such as duplicating part or all of the code, or using different words or statements to express the same meaning as the code of another program.

Firstly, this paper introduces the basic theories of source code copy detection technology and analyzes the functions and characteristics of current copy detection systems for source code. The key technologies of these systems are presented.

Secondly, drawing on the ideas of the Karp-Rabin string matching algorithm and the GST algorithm, this paper presents a copy detection system for source code based on a string hash value matching method, intended to address the deficiencies of current copy detection systems. The architecture of the system and the basic theory of each module are given.

Thirdly, this paper describes the properties and techniques of the copy detection system based on the string hash value matching method. A lexical analyzer is designed to extract features of the source code and generate a token string. The token string is divided into overlapping substrings. A "rolling" hash function is adopted to compute the hash values of the substrings. A new method for measuring similarity is presented.

Finally, based on this research, a prototype of the copy detection system for program source code based on the string hash value matching method is designed and implemented using an object-oriented method. The accuracy of the copy detection results and the time complexity of the algorithms are evaluated.
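The pipeline outlined in the abstract (lexical analysis into a token string, overlapping substrings, a rolling hash, and a similarity measure) can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the thesis's design: the keyword set, the `ID`/`NUM` token classes, the constants `BASE` and `MOD`, the window length `k`, and all function names are hypothetical. The abstract does not describe its new similarity measure, so Jaccard similarity over the hash sets is swapped in as a stand-in.

```python
import re

# Hypothetical keyword set; the thesis's lexical analyzer is not specified in the abstract.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "float", "char", "void"}
# Identifiers/keywords, then numbers, then any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")

BASE = 257            # assumed hash base
MOD = (1 << 61) - 1   # assumed large prime modulus


def tokenize(source):
    """Turn source text into a token string; identifiers and numbers collapse to classes."""
    tokens = []
    for lexeme in TOKEN_RE.findall(source):
        if lexeme in KEYWORDS:
            tokens.append(lexeme)
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")    # renaming a variable does not change the token string
        elif lexeme[0].isdigit():
            tokens.append("NUM")
        else:
            tokens.append(lexeme)  # operators and punctuation are kept as-is
    return tokens


def token_code(token):
    """Deterministic positive integer for a token (built-in hash() varies between runs)."""
    code = 0
    for ch in token:
        code = (code * 131 + ord(ch)) % MOD
    return code + 1


def rolling_hashes(tokens, k):
    """Karp-Rabin style hashes of every overlapping window of k tokens."""
    if k <= 0 or len(tokens) < k:
        return []
    codes = [token_code(t) for t in tokens]
    high = pow(BASE, k - 1, MOD)  # weight of the token that leaves the window
    h = 0
    for c in codes[:k]:
        h = (h * BASE + c) % MOD
    hashes = [h]
    for i in range(k, len(codes)):
        # Drop the oldest token, shift the window, append the newest token.
        h = ((h - codes[i - k] * high) * BASE + codes[i]) % MOD
        hashes.append(h)
    return hashes


def similarity(source_a, source_b, k=4):
    """Jaccard similarity of the two sets of window hashes (a stand-in measure)."""
    a = set(rolling_hashes(tokenize(source_a), k))
    b = set(rolling_hashes(tokenize(source_b), k))
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Under these assumptions, `similarity("int sum(int a, int b) { return a + b; }", "int add(int x, int y) { return x + y; }")` returns 1.0, because renaming identifiers leaves the token string unchanged. The rolling update computes each window hash in constant time from the previous one, which is the property that makes Karp-Rabin matching attractive for comparing many submissions.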
Keywords/Search Tags: Copy Detection, Feature Extraction, Similarity, String Matching