Research On Text Authorship Identification Technology And Its Application In Network Tracking

Posted on: 2022-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: F X Hu
Full Text: PDF
GTID: 2518306524990379
Subject: Master of Engineering
Abstract/Summary:
With the rapid development of the Internet, a large number of anonymous texts have emerged online. Anonymous texts are often filled with false information, fraudulent information, and even rumors that endanger national security. In particular, the dark web has become an ideal place for criminals to operate because of its inherent invisibility. Text authorship identification technology can help find and track the authors of online texts, thereby combating and preventing cybercrime and maintaining the health and safety of the online environment. However, when existing text authorship identification techniques are applied to online texts, their accuracy and reliability are low, and the text feature screening process requires a high degree of manual involvement. Therefore, this thesis employs deep learning and other advanced natural language processing techniques to identify the authors of anonymous web texts. The thesis covers three aspects of research:

(1) To address the poor authorship identification results on web text, this thesis proposes an authorship identification model based on deep learning. The model captures text features with a Transformer encoder, which not only learns more diverse text characteristics but also improves the parallel efficiency of the model, thus accelerating network training. In addition, to overcome the loss of text features caused by pooling operations in convolutional networks, the model introduces a capsule network that aggregates text features through a dynamic routing mechanism, retaining the text features to the greatest extent. Compared with other authorship identification models, this model achieves a slight increase in accuracy, precision, recall, and F1 on the authorship identification task.

(2) To meet the demand for tracking the authors of web texts, this thesis proposes a text authorship verification model based on a Siamese network. Through the weight sharing of the Siamese network, the model maps the input texts into the same high-dimensional feature space, so that the similarity between their features can be calculated and authorship can be verified. In addition, to obtain more comprehensive text features and improve verification accuracy, the model combines the deep features and global features of the text during feature extraction before performing authorship verification with the Siamese network. Compared with other text feature extraction network models, this model significantly improves accuracy on authorship verification tasks.

(3) Based on the authorship verification model proposed in this thesis, an author tracking application system for online texts has been designed and implemented, which identifies and tracks anonymous Internet texts in both the time and space dimensions. The system calculates the similarity of web text authors and finds the distribution of an author's trajectory and the frequency with which the author appears at each geographical location. It presents these functions clearly and intuitively through visualization, helping users operate the system and obtain results, and thereby carry out the author tracking task efficiently.
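The weight-sharing idea behind the authorship verification model in (2) can be illustrated with a small sketch. This is not the thesis's model: the character-trigram features, the random projection matrix `SHARED_W`, the dimensions, and the function names are all assumptions chosen to keep the example self-contained. It only shows that both texts pass through the same encoder, so their embeddings live in one feature space where a similarity score can be computed.

```python
import math
import random
from collections import Counter

DIM_IN = 64   # number of hashed trigram buckets (illustrative choice)
DIM_OUT = 16  # size of the shared embedding space (illustrative choice)

# One weight matrix, created once. Both branches of the pair reuse it,
# which is what "weight sharing" means in a Siamese setup.
# Random weights stand in for weights that would normally be learned.
_rng = random.Random(0)
SHARED_W = [[_rng.gauss(0.0, 1.0) for _ in range(DIM_IN)] for _ in range(DIM_OUT)]


def trigram_features(text):
    """Count character trigrams and fold them into a fixed-length vector."""
    counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    vec = [0.0] * DIM_IN
    for gram, n in counts.items():
        # Sum of code points is a deterministic stand-in for a hash function.
        vec[sum(map(ord, gram)) % DIM_IN] += n
    return vec


def encode(text):
    """Shared encoder: the same weights are applied to every input text."""
    feats = trigram_features(text)
    return [math.tanh(sum(w * x for w, x in zip(row, feats))) for row in SHARED_W]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def same_author_score(text_a, text_b):
    """Embed both texts with the shared encoder and compare the embeddings."""
    return cosine(encode(text_a), encode(text_b))
```

In a trained system the shared weights would be learned from labeled text pairs, and a threshold on the score would decide whether two texts are attributed to the same author; both of those steps are outside this sketch.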
Keywords/Search Tags:Authorship identification, Capsule network, Transformer, Siamese network