
Efficient Plagiarism Detection Techniques And Systems On Semantics Of Academic Paper

Posted on: 2011-07-27  Degree: Master  Type: Thesis
Country: China  Candidate: F Y Kang  Full Text: PDF
GTID: 2178360305497392  Subject: Computer software and theory
Abstract/Summary:
In recent years, with the rapid development of Internet technology and the growth of online databases, academic papers have become much easier to access. Researchers can readily obtain the papers they need to support their work. The same convenience, however, also makes copying easier: people need only copy and paste from other people's papers and present the content as their own. Plagiarism is becoming increasingly common in academia and has drawn strong public concern. An efficient technique for detecting plagiarism would not only uncover copied papers but also help deter plagiarism itself. For this reason, the author and her group cooperated with Shanghai Early Bird Information Technology Ltd to develop a plagiarism detection system based on a meta-search engine. Building on that preliminary work, this thesis analyzes document copy detection techniques in China and abroad and optimizes the system using the semantics of academic papers. Our main contributions are listed below:
1. Summarize domain ontology techniques.
2. Design the architecture of an ontology-based paper plagiarism detection system, and describe the system workflow and the algorithms used.
3. Discuss the general structure of academic papers and the semantic information contained in their structural content. Present an approach to building the paper structural ontology, and design a pre-processing approach based on the structural ontology that efficiently pre-classifies papers and finds the candidate sets.
4. Propose the relative unit density model, and design approaches to filter candidate sets, compute similarity, and determine plagiarism.
5. Verify the effectiveness and efficiency of the ontology-based plagiarism detection algorithm with typical test cases.
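The abstract does not define the relative unit density model, so the following is only a minimal sketch of the general filter, similarity, and decision stages named in contribution 4, using a swapped-in and much simpler technique: Jaccard similarity over word shingles. The function names (`shingles`, `jaccard`, `detect`), the thresholds, and the shingle size are all assumptions made for illustration and do not come from the thesis.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (as tuples) in lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield one shingle holding all their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def detect(suspect, corpus, filter_threshold=0.1,
           plagiarism_threshold=0.5, k=3):
    """Flag corpus documents that resemble the suspect text.

    corpus maps a document name to its text. Thresholds are
    illustrative assumptions, not values from the thesis.
    """
    suspect_words = set(suspect.lower().split())
    suspect_shingles = shingles(suspect, k)
    flagged = []
    for name, text in corpus.items():
        # Stage 1 (filter): cheap vocabulary overlap prunes the candidate set.
        if jaccard(suspect_words, set(text.lower().split())) < filter_threshold:
            continue
        # Stage 2 (similarity): finer comparison on word shingles.
        score = jaccard(suspect_shingles, shingles(text, k))
        # Stage 3 (decision): report documents above the threshold.
        if score >= plagiarism_threshold:
            flagged.append((name, score))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)
```

The two-stage design mirrors the idea of narrowing to a candidate set before running a costlier comparison; a real system along the lines the abstract describes would replace both stages with ontology-aware measures.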
Keywords/Search Tags: Paper Semantics, Plagiarism Detection, Internet Information Extraction