
Research On SQL Code Similarity Detection Algorithm

Posted on: 2021-05-27  Degree: Master  Type: Thesis
Country: China  Candidate: W X Ge  Full Text: PDF
GTID: 2518306032967789  Subject: Engineering
Abstract/Summary:
The rapid development of the Internet has driven the optimization and upgrading of online teaching, but it has also made plagiarism easier and its methods more diverse. In computer courses at colleges and universities in particular, code plagiarism is widespread and seriously undermines both the evaluation of student performance and teaching quality. The code plagiarism detection methods and systems currently in wide use do not cover SQL code. This thesis therefore studies SQL code similarity detection algorithms in order to detect plagiarism in the SQL code that students submit to the online learning website of a university database course. The main contributions of the thesis are:

1) A new SQL code similarity detection algorithm, RGS, is proposed. The principles and characteristics of the SimHash and RKR-GST code similarity detection algorithms are analyzed, an attribute counting method is introduced to extract the characteristic attributes of SQL code, and the three methods are combined by weighting to improve detection accuracy.
2) A similarity detection algorithm for SQL code based on coding behavior is proposed. It examines whether the coding behavior shown in the SQL code under test is consistent with the coding behavior in SQL code the same student has written previously. Experiments show that this algorithm can detect cases of plagiarism that the RGS algorithm misses.
3) A SQL code similarity detection system for online learning in database courses is designed and implemented, covering requirements analysis, design, and implementation. The system combines the coding-behavior-based algorithm with the RGS algorithm, and the accuracy of plagiarism detection reaches 84.78%.
Keywords/Search Tags: SimHash, RKR-GST, Coding behavior, Similarity detection, System design
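
The weighted combination described in item 1) can be illustrated with a minimal Python sketch. It is not the thesis's RGS implementation, and everything specific in it is an assumption: the tokenizer, the keyword set used for attribute counting, the cosine comparison of the counts, and the weights (0.4, 0.4, 0.2) are all hypothetical. The tiling step is a plain greedy string tiling without the Karp-Rabin hashing that RKR-GST uses to speed up matching.

```python
import hashlib
import math
import re
from collections import Counter

# Hypothetical keyword set for attribute counting (not taken from the thesis).
SQL_KEYWORDS = {"select", "from", "where", "join", "group", "order", "by",
                "having", "and", "or", "insert", "update", "delete"}

def tokenize(sql):
    # Lowercase, then split into identifiers, numbers, and single punctuation marks.
    return re.findall(r"[a-z_][a-z0-9_]*|\d+|[^\s\w]", sql.lower())

def simhash(tokens, bits=64):
    # Classic SimHash: every token votes +1 or -1 on each bit position.
    votes = [0] * bits
    for tok in tokens:
        h = int.from_bytes(hashlib.md5(tok.encode()).digest()[:8], "big")
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if votes[i] > 0)

def simhash_similarity(a, b, bits=64):
    # 1 minus the normalized Hamming distance between the two fingerprints.
    distance = bin(simhash(a, bits) ^ simhash(b, bits)).count("1")
    return 1.0 - distance / bits

def gst_similarity(a, b, min_match=2):
    # Simplified greedy string tiling: repeatedly tile the longest unmarked
    # common token runs until none of length >= min_match remain.
    marked_a, marked_b = [False] * len(a), [False] * len(b)
    covered = 0
    while True:
        best, matches = min_match - 1, []
        for i in range(len(a)):
            for j in range(len(b)):
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > best:
                    best, matches = k, [(i, j)]
                elif k == best and k >= min_match:
                    matches.append((i, j))
        if not matches:
            break
        for i, j in matches:
            if any(marked_a[i:i + best]) or any(marked_b[j:j + best]):
                continue  # overlaps a tile already placed in this round
            for k in range(best):
                marked_a[i + k] = marked_b[j + k] = True
            covered += best
    total = len(a) + len(b)
    return 2 * covered / total if total else 1.0

def attribute_similarity(a, b):
    # Attribute counting: cosine similarity of SQL keyword frequency vectors.
    ca = Counter(t for t in a if t in SQL_KEYWORDS)
    cb = Counter(t for t in b if t in SQL_KEYWORDS)
    dot = sum(ca[k] * cb[k] for k in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def combined_similarity(sql_a, sql_b, weights=(0.4, 0.4, 0.2)):
    # Weighted sum of the three scores; the weights are illustrative only.
    a, b = tokenize(sql_a), tokenize(sql_b)
    scores = (simhash_similarity(a, b), gst_similarity(a, b),
              attribute_similarity(a, b))
    return sum(w * s for w, s in zip(weights, scores))
```

Calling `combined_similarity("SELECT name FROM students WHERE age > 18", "SELECT name FROM students WHERE age > 20")` returns a score in [0, 1]. In a real system the weights and any plagiarism threshold would have to be tuned on labeled submissions.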