
AST-based Multi-language Plagiarism Detection Method

Posted on: 2013-10-31
Degree: Master
Type: Thesis
Country: China
Candidate: C L Liu
Full Text: PDF
GTID: 2248330395466492
Subject: Computer application technology
Abstract/Summary:
The development of the Internet has brought us more information, but it has also made plagiarism more common in programming courses. Programming is an integral part of practical teaching for computer science majors in higher education, and courses are taught in a variety of programming languages. A common feature of these courses is that assignments are submitted to the teacher as electronic documents. The practice of students completing assignments by downloading code from the Internet or copying other students' code has intensified. Curbing code plagiarism is increasingly important for improving the teaching quality of programming courses, and this requires accurate and efficient code plagiarism detection. Research on plagiarism identification technology and its application is therefore of considerable significance.

Working mainly from the perspective of computer-aided teaching, and considering the common means of copying in student assignments and tests, this thesis proposes a code plagiarism detection method based on the AST (abstract syntax tree) that can be applied to a variety of programming languages. First, a parser generator is used to produce a lexical analyzer and a syntax analyzer from the grammar file of each programming language. Then the source code is converted into an abstract syntax tree by the lexer and parser, and the generated tree is optimized by removing redundant information. Next, sequence alignment is used to calculate code similarity. Finally, the features of the similar parts are extracted to build feature vectors, and cluster analysis of those vectors is used to find the "copy clusters". The experimental results show that the proposed AST-based multi-language code plagiarism detection method changes the status quo in which most similar studies focus only on similarity calculation and lack analysis of plagiarism groups, further improving research in this area. As long as a grammar file for a given high-level language is available, code in that language can be parsed and checked for plagiarism, which gives the method good generality and scalability.
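
The steps summarized in the abstract (parse to an AST, drop redundant information, align sequences, group similar submissions) can be illustrated with a minimal sketch. It is not the thesis's implementation and rests on several assumptions: Python's built-in `ast` module stands in for the parsers generated from grammar files, so only one language is covered; a longest-common-subsequence score stands in for the unspecified sequence alignment; and threshold-based grouping stands in for the feature-vector clustering. The names `node_sequence`, `similarity`, `copy_clusters` and the 0.9 threshold are hypothetical.

```python
import ast


def node_sequence(source: str) -> list[str]:
    """Return the pre-order sequence of AST node type names for `source`.

    Identifier names and literal values are never emitted, and Load/Store
    context markers are skipped. This is an assumed stand-in for the
    "redundant information" removal step.
    """
    seq: list[str] = []

    def visit(node: ast.AST) -> None:
        if isinstance(node, ast.expr_context):
            return  # Load/Store/Del markers add no structural information
        seq.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return seq


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence, one DP row at a time."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a: list[str], b: list[str]) -> float:
    """Score in [0, 1]: twice the LCS length over the combined length."""
    total = len(a) + len(b)
    return 1.0 if total == 0 else 2 * lcs_length(a, b) / total


def copy_clusters(sources: dict[str, str], threshold: float = 0.9) -> list[set[str]]:
    """Group submissions linked by pairwise similarity >= threshold.

    Simplification: connected components via union-find, not the
    feature-vector cluster analysis described in the abstract.
    """
    names = sorted(sources)
    seqs = {name: node_sequence(sources[name]) for name in names}
    parent = {name: name for name in names}

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if similarity(seqs[a], seqs[b]) >= threshold:
                parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    # Only groups with at least two members count as a "copy cluster".
    return [group for group in groups.values() if len(group) > 1]
```

Because only node types are compared, two submissions that differ solely in variable and function names produce identical sequences and land in the same group. Supporting another language under this sketch would mean replacing `node_sequence` with a parser for that language, which is the role the grammar files play in the abstract.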
Keywords/Search Tags: Plagiarism detection, AST, Multi-language, sequence alignment, cluster