A Study Of How To Judge Plagiarism In The Academic Theses Based On The Technology Of Text Mining

Posted on: 2010-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: J J Zhao
GTID: 2178360275977915
Subject: Computer application technology
Abstract/Summary:
Since the academic community began to protest against academic corruption in the 1990s, reported incidents have mainly involved academic misconduct, especially plagiarism in academic works and theses. Methods for judging plagiarism in academic theses are significant not only for protecting intellectual property, improving the quality of theses, cleaning up academic fields, and preventing academic corruption, but also for preventing the same manuscript from being submitted to multiple journals and for lightening editors' workload.

Text mining extends data mining to unstructured and semi-structured text data. Most of the information in daily life is presented as text. Text mining is a non-trivial process that discovers valid, novel, useful, and understandable patterns, models, and trends in text. It obtains useful information and knowledge by applying intelligent algorithms together with text-processing techniques to analyze large numbers of unstructured text sources (such as documents, spreadsheets, e-mails, books, and web pages), extracting or marking the relations between words, and classifying documents according to their contents.

The aim of this thesis is to apply text mining technology to the judgment of plagiarism in academic theses. The major research tasks are as follows: (1) to analyze and summarize the types of plagiarism and the techniques for judging it (such as digital fingerprinting and word-frequency statistics) on a legal basis; (2) to study information retrieval, information extraction, and the main text mining methods (association analysis, text categorization, text clustering, automatic summarization, etc.); (3) to implement similarity calculation based on text similarity and word-frequency statistics, with good results; (4) to develop a Theses Plagiarism Judgment System based on text categorization, using similarity calculation at the level of the whole text, paragraphs, and sentences.
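The abstract does not say which similarity measure the thesis uses. As an illustration only, the sketch below assumes cosine similarity over raw word-frequency vectors, one common way to compare texts by word-frequency statistics. The function names, the English-only tokenizer, and the 0.8 threshold are all hypothetical choices and are not taken from the thesis.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Lowercase the text and count word occurrences (simple English tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two word-frequency vectors, from 0.0 to about 1.0."""
    freq_a, freq_b = term_frequencies(text_a), term_frequencies(text_b)
    # Counter returns 0 for words missing from freq_b, so only shared words contribute.
    dot = sum(count * freq_b[word] for word, count in freq_a.items())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? (an assumption for this sketch)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def suspicious_sentences(candidate, source, threshold=0.8):
    """Return (candidate sentence, closest source sentence, score) for scores >= threshold."""
    source_sents = split_sentences(source)
    hits = []
    for sent in split_sentences(candidate):
        best = max(source_sents, key=lambda s: cosine_similarity(sent, s), default=None)
        if best is None:
            continue
        score = cosine_similarity(sent, best)
        if score >= threshold:
            hits.append((sent, best, score))
    return hits
```

The same `cosine_similarity` function can be called on whole documents, on paragraphs, or on sentences, which mirrors the three levels of comparison named in task (4). A real system for Chinese theses would also need word segmentation, which this sketch omits.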
Keywords/Search Tags:text mining, plagiarism judgment, text categorization, similarity in text, word-frequency statistics