
Research On Author Recognition Technology Based On Writing Style Crack Discovery

Posted on: 2020-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: T Li
Full Text: PDF
GTID: 2428330575968800
Subject: Software engineering
Abstract:
To pass off others' work as their own while evading misconduct checks, plagiarists rarely copy whole paragraphs verbatim; they also rework the content they take. Against this background, this thesis judges the originality of an article from the perspective of writing style. Writing style is a habit formed over an author's long-term writing and does not change in the short term, so studying it reveals the author's writing habits and helps determine whether an article is original. After surveying the current state of research and methods in related fields in China and abroad, this thesis designs a set of originality detection methods.

Because an article may not be written by a single person, this thesis first proposes the concept of a “style crack”. A style crack marks the position where the writing style changes, and the text is segmented at these positions. Style cracks are recognized through style feature extraction. Considering style at the level of words, sentences, and emotion, this thesis designs 7 features for style crack recognition, locates style cracks by combining the extracted features with a clustering algorithm, and segments the text accordingly.

Author recognition is then performed on each segment. This thesis constructs a word-level author recognition framework (ARTW) for this purpose. The framework uses GloVe word vectors as the underlying embeddings and embeds a Bi-GRU into a Siamese (twin) neural network to represent the text: the Bi-GRU extracts high-order features, and the Siamese network computes similarity. During training, an attention mechanism is applied to function words so that the network converges toward stylistic cues. After Bi-GRU feature extraction, this thesis proposes segmentation pooling and an MLP hidden layer to extract high-order text features, and introduces a joint loss function over high-order feature extraction and similarity calculation to improve the robustness of the framework.

Style crack recognition and author recognition are evaluated experimentally on different corpora and on a mixed-topic corpus, and the validity and value of both style crack recognition and the ARTW framework are verified by comparison with traditional algorithms.
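The style crack idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not list the 7 features or name the clustering algorithm, so the three features below (sentence length, mean word length, function-word ratio), the `FUNCTION_WORDS` list, and the single-boundary search that stands in for clustering are hypothetical and are not the thesis's method.

```python
import re
from statistics import mean

# Hypothetical function-word list; the thesis's actual features are not given in the abstract.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "and", "but", "or",
                  "to", "that", "which", "is", "are", "was", "were"}


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def style_features(sentence):
    """Return (word count, mean word length, function-word ratio) for one sentence."""
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    if not words:
        return (0.0, 0.0, 0.0)
    return (
        float(len(words)),
        mean(len(w) for w in words),
        sum(w in FUNCTION_WORDS for w in words) / len(words),
    )


def normalise(vectors):
    """Min-max scale each feature column to [0, 1]; constant columns become 0."""
    scaled_cols = []
    for col in zip(*vectors):
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled_cols.append([(v - lo) / span if span else 0.0 for v in col])
    return [tuple(row) for row in zip(*scaled_cols)]


def find_style_crack(text, min_segment=2):
    """Return the sentence index where a single style crack is most likely.

    The boundary chosen is the one that maximises the Euclidean distance
    between the mean feature vectors of the two sides. This is a simple
    stand-in for the clustering step, restricted to one crack.
    Returns None if the text is too short to split.
    """
    sentences = split_sentences(text)
    if len(sentences) < 2 * min_segment:
        return None
    feats = normalise([style_features(s) for s in sentences])
    best_index, best_gap = None, -1.0
    for i in range(min_segment, len(feats) - min_segment + 1):
        left = [mean(col) for col in zip(*feats[:i])]
        right = [mean(col) for col in zip(*feats[i:])]
        gap = sum((l - r) ** 2 for l, r in zip(left, right)) ** 0.5
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index
```

The returned index is the position of the first sentence after the crack, so `sentences[:index]` and `sentences[index:]` are the two segments that would each be passed on to author recognition. Handling several cracks in one document would need a recursive or multi-cluster variant, which this sketch does not attempt.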
Keywords: style crack, style feature, author identification, Bi-GRU