Research On The Prototype System For Plagiarism Detection In High School Tibetan Composition

Posted on: 2021-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: X Yu
GTID: 2437330602498653
Subject: The modern education technology
Abstract/Summary:
With the rapid development of the Internet, it has become common for students to copy online resources. In higher education there has been some research on plagiarism detection for student theses, but for low-resource languages such as Tibetan there are still large research gaps in composition plagiarism detection and cross-language plagiarism detection.

High school students are in a critical period of learning and growth, and a correct attitude toward learning at this stage has an important impact on their future development and the cultivation of their abilities. The composition component of the Chinese subject is one of the most difficult parts of the curriculum. Faced with writing difficulties, many students turn to the massive resources on the Internet for reference, and many of them plagiarize directly or indirectly. Plagiarism does not help students develop their writing ability; instead, it increases teachers' workload in assessing compositions and undermines a fair learning environment.

Most existing plagiarism detection systems are designed for academic papers and rely on article elements such as references and paper structure. A student composition differs from a thesis. First, it does not have the "Abstract-Text-References-Acknowledgements" structure of a dissertation. Second, it is more colloquial and lyrical than a dissertation, and writing in this manner cannot be regarded as plagiarism in actual teaching. Nevertheless, both composition plagiarism detection and thesis plagiarism detection are essentially text similarity detection. The two tasks are therefore similar, but not identical. Building on existing research on plagiarism detection, this thesis studies plagiarism detection for student compositions, specifically Tibetan compositions written in high school.

Plagiarism can be roughly divided into three types: copy plagiarism, semantic rewrite plagiarism, and cross-language translation plagiarism. A detection method is proposed for each type, and a prototype system for plagiarism detection in Tibetan high school compositions is constructed. The main research contents are as follows.

Detection of copy plagiarism: based on the longest common subsequence algorithm, this thesis handles continuous copying in Tibetan high school composition text. In experiments, the longest common subsequence method at the sentence level reaches an accuracy of 92.7%.

Detection of semantic rewrite plagiarism: a Siamese long short-term memory (LSTM) network with an attention mechanism is used, with trained Tibetan syllable vectors as model input, to train a detection model for semantic rewrite plagiarism in Tibetan compositions. Experiments show that the Pearson correlation coefficient of this method reaches 0.6010, which indicates a strong correlation between the similarity values computed by the algorithm and the manual annotations, and a high accuracy.

Detection of cross-language translation plagiarism: a Siamese long short-term memory (LSTM) network with an attention mechanism is used, with Tibetan-Chinese cross-language word vectors trained from a manually constructed Tibetan-Chinese dictionary. Experiments show that the Pearson correlation coefficient of this method reaches 0.4780, which indicates a moderate correlation between the model output and the manually labeled values.

Finally, this thesis combines the above work to design and implement a prototype system for plagiarism detection in high school Tibetan compositions. The system is aimed at teachers and researchers who work with Tibetan compositions and provides automatic plagiarism detection. It can run detection separately in monolingual and cross-language modes, and it reports similarity values for uploaded compositions together with a comparison of the plagiarized sentences. The system is simple, clear, practical and efficient to operate, and provides good technical support for the day-to-day plagiarism detection of students' Tibetan compositions.
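
The sentence-level longest common subsequence (LCS) step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the normalization by the suspect sentence's length, the 0.8 threshold, and the choice to split Tibetan sentences into syllables on the tsheg mark (U+0F0B) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

TSHEG = "\u0f0b"  # Tibetan syllable delimiter; splitting on it is an assumption


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(suspect: Sequence[str], source: Sequence[str]) -> float:
    """Share of the suspect sentence's tokens covered by the LCS (hypothetical score)."""
    if not suspect or not source:
        return 0.0
    return lcs_length(suspect, source) / len(suspect)


def split_syllables(sentence: str) -> List[str]:
    """Split a Tibetan sentence into syllables on the tsheg mark."""
    return [s for s in sentence.split(TSHEG) if s]


def flag_copied(
    suspect_sentences: Sequence[str],
    source_sentences: Sequence[str],
    threshold: float = 0.8,  # hypothetical cut-off, not taken from the thesis
) -> List[Tuple[str, float]]:
    """Return suspect sentences whose best LCS similarity reaches the threshold."""
    flagged = []
    for sentence in suspect_sentences:
        tokens = split_syllables(sentence)
        best = max(
            (lcs_similarity(tokens, split_syllables(src)) for src in source_sentences),
            default=0.0,
        )
        if best >= threshold:
            flagged.append((sentence, best))
    return flagged
```

Normalizing by the suspect sentence's length means a short copied sentence embedded in a longer source sentence still scores close to 1; a different normalization would change which sentences are flagged.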
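
The two semantic models are evaluated by the Pearson correlation between model similarity scores and manual annotations (0.6010 and 0.4780 in the abstract). As a reminder of how that coefficient is computed, here is a small self-contained sketch; the function name `pearson` is our own, and the thesis does not state which tool it used for the calculation.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, without the 1/n factors (they cancel in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

In an evaluation of this kind, `xs` would hold the model's similarity scores for a set of sentence pairs and `ys` the corresponding human-assigned scores; values near 1 indicate that the model ranks and scales pairs much as the annotators do.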
Keywords: Composition Plagiarism Detection, Low-Resource Language, Text Similarity Calculation