
Research on Deep Learning-Based Fuzzy Text Plagiarism Detection

Posted on: 2022-11-27; Degree: Master; Type: Thesis
Country: China; Candidate: S H Zhou
GTID: 2518306758467134; Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet technology, producing material and exchanging information online has become increasingly convenient. At the same time, plagiarism has become more frequent and more serious, and cases in academic work are exposed regularly. Preventing and detecting plagiarism has therefore become a focus of research. Text plagiarism detection studies whether a text contains plagiarized content, and it is mainly divided into two frameworks: external and internal plagiarism detection. Although current methods perform well on copy-paste and synonym-substitution plagiarism, they still fail to identify complex paraphrased sentences or paragraphs. This thesis analyzes the current mainstream detection methods in depth and aims to solve two major pain points of current plagiarism detection.

To address the low accuracy of current methods on fuzzy, human-simulated plagiarism and complex paraphrase plagiarism, this thesis proposes a deep learning-based fuzzy text plagiarism detection method. The method uses a Siamese network to map matched text pairs into the same parameter space for accurate semantic alignment and similarity measurement, captures rich local text features and improves model training through convolutional and residual networks, and establishes text interaction patterns of different dimensions to enhance semantic similarity matching between text segments. On this basis, late fusion of the text features further improves prediction performance. Experimental results show that the method performs well on both a dedicated paraphrase dataset and a plagiarism competition dataset.

To address the limitations of text plagiarism detection methods in detecting plagiarism activities, this thesis also proposes a cross-modal plagiarism detection method. Previous work has mostly focused on unimodal text data, while plagiarism-related features outside the papers themselves have rarely been studied. This thesis analyzes real user behavior data from academic platforms and uses it as features: it derives statistics and distribution patterns from the behavior data and builds a neural network to identify abnormal user operations, so as to detect potential plagiarism activities. The fuzzy text detection method proposed in this thesis is then combined with the multimodal feature data, introduced as a trade-off feature, to identify and detect plagiarism across a broader range.
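
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea behind a Siamese matcher with a convolution-plus-residual encoder and pairwise interaction features. It is not the thesis's model: the hash-based embedding, the fixed untrained kernel, and every name (`embed`, `conv_residual`, `encode`, `interaction_features`, `similarity`, `DIM`, `KERNEL`) are assumptions made for illustration, and a real system would learn these parameters and feed the features to a trained classifier.

```python
import hashlib
import math

DIM = 8                      # hypothetical embedding width
KERNEL = [0.25, 0.5, 0.25]   # hypothetical, untrained width-3 kernel


def embed(token):
    """Toy deterministic embedding: hash bytes rescaled to [-1, 1]."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [b / 127.5 - 1.0 for b in digest[:DIM]]


def conv_residual(seq, kernel):
    """Width-3 convolution over token positions (zero padded), ReLU,
    then a residual skip that adds the layer input back."""
    zero = [0.0] * DIM
    padded = [zero] + seq + [zero]
    out = []
    for i, x in enumerate(seq):
        window = padded[i:i + 3]
        conv = [sum(kernel[k] * window[k][d] for k in range(3))
                for d in range(DIM)]
        out.append([max(0.0, c) + xi for c, xi in zip(conv, x)])
    return out


def encode(text, kernel=KERNEL):
    """Shared encoder: both texts of a pair pass through this same
    function with the same parameters (the Siamese property)."""
    tokens = text.lower().split()
    if not tokens:
        return [0.0] * DIM
    hidden = conv_residual([embed(t) for t in tokens], kernel)
    # Max-pool over positions to get a fixed-size vector.
    return [max(h[d] for h in hidden) for d in range(DIM)]


def interaction_features(u, v):
    """Element-wise |u - v| and u * v, concatenated."""
    return [abs(a - b) for a, b in zip(u, v)] + [a * b for a, b in zip(u, v)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity(text_a, text_b):
    """Return (cosine score, interaction features) for a text pair."""
    u, v = encode(text_a), encode(text_b)
    return cosine(u, v), interaction_features(u, v)


if __name__ == "__main__":
    score, feats = similarity("the cat sat on the mat",
                              "a cat was sitting on the mat")
    print(round(score, 3), len(feats))
```

Because the weights here are untrained, the scores carry no semantic meaning; the sketch only shows how one shared encoder yields comparable vectors for both texts and how interaction features are derived from them.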
Keywords/Search Tags:Text plagiarism, Fuzzy detection, Deep learning, Cross-modality