
The Design And Implementation Of The Crime-related Text Analysis System Based On Deep Learning

Posted on: 2022-07-09 | Degree: Master | Type: Thesis
Country: China | Candidate: X Wang | Full Text: PDF
GTID: 2506306482965499 | Subject: Criminal science and technology
Abstract/Summary:
While the Internet brings convenience to daily life, it also hides various non-traditional security risks. On the one hand, cyberspace has become a hotbed for criminal activity, such as publishing drug-related information, conducting online drug transactions, and disseminating obscene and pornographic material. On the other hand, domestic and foreign reactionary forces and terrorists spread inflammatory comments, reactionary speech, and terrorism-related information through online platforms, which poses a major threat to national security, social harmony and stability, and the safety of people’s lives and property, and presents new challenges to public security work.

The traditional method of identifying case-related text through sensitive-dictionary matching ignores context, cannot make full use of key features, and is easily bypassed deliberately, so its recognition accuracy is low. After dictionary matching, manual judgment is still required. Faced with a huge amount of data, this wastes manpower, material resources, and time, and severely harms the efficiency of public security work.

In response to these problems, this thesis applies deep learning and natural language processing techniques, constructs a dataset of crime-related text for experiments, and proposes a deep learning-based method for recognizing crime-related text. Exploiting the correlation between the sensitive word entities in a text and the type of the text, the method trains sensitive word entity recognition and crime type recognition in the same framework, yielding a crime-related text recognition model based on multi-task learning.

In the sensitive word entity recognition task, traditional character embeddings and Word2vec-based word embeddings cannot handle polysemy well. This thesis therefore constructs a sensitive word entity recognition model based on BERT-BiGRU-IDCNN-CRF. The method first uses the BERT pre-trained language model to obtain vector representations of the text, then feeds the feature sequence into the BiGRU-IDCNN-CRF model for training. BiGRU and IDCNN capture semantic features of different granularities, which enriches the semantic information and improves the accuracy of sensitive word entity recognition. Comparison experiments with traditional named entity recognition models demonstrate the effectiveness of this method on the task.

In the crime-related text classification task, the BERT layer, BiGRU, and IDCNN form the shared encoding layer of the multi-task learning model. Sensitive word entity recognition serves as an auxiliary task, and multi-task learning helps the text classification model learn more abstract hidden-layer features. A multi-head attention mechanism is introduced into the classifier to capture key semantic features, and a Softmax layer produces the predicted category of the text. To verify the effectiveness of this method, comparative experiments are conducted with TextCNN, BiLSTM, RCNN, Att-BiLSTM, BiGRU-CNN, Self-Att-CNN, and the proposed network, and the proposed model is also evaluated with different text representations. The results show that the multi-task crime-related text recognition model reaches a classification accuracy of 97.56% and an F1 value of 95.33%, higher than the other models, which demonstrates the effectiveness of the method.

Finally, based on this method, an analysis system for Chinese crime-related texts was designed and implemented, and its usability and efficiency were verified by testing and demonstrating the system functions.
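
The weaknesses of dictionary matching described in the abstract can be illustrated with a minimal sketch. It is not the thesis's code: the word list, the function name `dictionary_match`, and the English sample sentences are hypothetical stand-ins chosen for illustration (the thesis works on Chinese text with its own dictionary, which is not given here).

```python
# Hypothetical word list; the thesis's actual sensitive dictionary is not given.
SENSITIVE_WORDS = {"heroin", "cocaine"}


def dictionary_match(text, lexicon=SENSITIVE_WORDS):
    """Return the lexicon entries that occur verbatim in the text (case-insensitive)."""
    lowered = text.lower()
    return sorted(word for word in lexicon if word in lowered)


# Illustrative inputs (assumed, not taken from the thesis dataset).
samples = [
    "Selling heroin, contact me privately.",       # plain mention: matched
    "Selling h-e-r-o-i-n, contact me privately.",  # obfuscated spelling: missed
    "Police seized heroin in yesterday's raid.",   # news report: matched despite benign context
]

for sample in samples:
    print(dictionary_match(sample), "<-", sample)
```

The second sample shows deliberate bypassing, and the third shows context being ignored: a pure substring match cannot tell an offer to sell from a news report, which is why a manual review step is still needed after matching.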
Keywords/Search Tags: crime-related text, multi-task learning, named entity recognition, text classification