Research On Drawing Characteristic Bunch Technology For Copying Plagiarism In Program Code

Posted on: 2010-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: M Hou
GTID: 2178360278451515
Subject: Computer application technology
Abstract/Summary:
Copy-detection techniques are widely used in the information age, especially in computer programming. Copy detection falls into two types: detection in formal-language text (for instance, computer program code) and detection in natural-language text. Detecting duplicated program code means judging whether a program has been plagiarized or copied from one or more other programs. The key technology is calculating the similarity of the code.
To calculate similarity, we first extract each program's characteristic values (the basic language units that represent the program's content and structure), then compare them and judge how similar the programs are from the comparison result. In this process, extracting the characteristic values is essential: their quality directly affects the accuracy of the comparison result. This thesis mainly studies the technology for extracting characteristic values.
The thesis first introduces techniques for detecting program code similarity, including the definition of program code similarity, the classification of detection techniques, the current state of research, and some detection systems in use. It also introduces the techniques used to extract characteristic values when detecting program code similarity.
Most existing systems adopt methods based on string comparison. They first segment the program code to obtain a characteristic string and then compare the strings for similarity. Such a string carries little structural information about the program, which reduces the accuracy of the comparison result.
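As a rough illustration of the string-based approach described above, the following minimal sketch segments Python source into a flat token sequence and compares two sequences. It is not one of the systems the thesis surveys: the names `characteristic_string` and `similarity` are hypothetical, Python's standard `tokenize` module stands in for the segmentation step, and `difflib.SequenceMatcher` is an assumed choice of comparison measure.

```python
import io
import keyword
import tokenize
from difflib import SequenceMatcher

# Token types that carry layout only and are dropped from the flat string.
_SKIPPED = (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER)


def characteristic_string(source: str) -> list:
    """Segment source code into a flat list of normalized tokens (hypothetical helper)."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if tok.type == tokenize.NAME:
            # Keep keywords, but hide identifier names so renaming has no effect.
            tokens.append(tok.string if keyword.iskeyword(tok.string) else "ID")
        elif tok.type == tokenize.NUMBER:
            tokens.append("NUM")
        elif tok.type == tokenize.STRING:
            tokens.append("STR")
        else:
            tokens.append(tok.string)
    return tokens


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between the flat token sequences of two programs."""
    return SequenceMatcher(None, characteristic_string(a),
                           characteristic_string(b)).ratio()


original = "def add(x, y):\n    total = x + y\n    return total\n"
renamed = "def plus(a, b):\n    s = a + b\n    return s\n"
print(similarity(original, renamed))
```

Because the token list is flat and indentation tokens are discarded, block nesting is not recorded in it, which is the kind of missing structural information the abstract points to.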
This thesis mainly investigates a method for converting program code into a structure string that carries more structural information, providing a better basis for the subsequent similarity comparison. The method designed in this research generates the structure string in three steps. First, define the lexical and grammar rules of the target language, which provide the basis for the subsequent analysis and conversion of the program source code. Second, generate a lexical analyzer and a syntax analyzer. Third, use the lexical analyzer and the syntax analyzer to analyze the program source code and produce the corresponding tree-like structure string.
Finally, the thesis validates the method through an implementation test. The research achieved the conversion of program source code into a tree-like structure string and met its intended research goal.
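The conversion from source code to a tree-like structure string can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: Python's built-in `ast` module stands in for the lexical and syntax analyzers generated in the first two steps, the function name `structure_string` is hypothetical, and the bracketed output format is an assumed encoding of the tree.

```python
import ast


def structure_string(source: str) -> str:
    """Convert Python source into a bracketed string of syntax-tree node types.

    Assumption: ast.parse plays the role of the lexical and syntax analyzers;
    the nested "Parent(Child Child)" notation is a made-up encoding.
    """

    def render(node: ast.AST) -> str:
        children = [
            render(child)
            for child in ast.iter_child_nodes(node)
            # Skip Load/Store/Del markers: they describe usage, not structure.
            if not isinstance(child, ast.expr_context)
        ]
        name = type(node).__name__
        return f"{name}({' '.join(children)})" if children else name

    return render(ast.parse(source))


print(structure_string("x = 1"))  # Module(Assign(Name Constant))

nested = "if a:\n    if b:\n        f()\n"
flat = "if a:\n    f()\nif b:\n    f()\n"
print(structure_string(nested))
print(structure_string(flat))
```

Unlike a flat token sequence, this string records nesting, so the nested and sequential `if` statements above yield different strings, while programs that differ only in identifier names yield the same one.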
Keywords/Search Tags: program code plagiarism detection, similarity, characteristic value, tree-like structure string