
Research On Public Security Business Text Mining Technology For Joint Cases Investigating

Posted on: 2023-07-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Zhao
Full Text: PDF
GTID: 1526307169477604
Subject: Cyberspace security
Abstract/Summary:
Public security business processes are complex, and case investigations generate large amounts of text data covering many types of information, such as network public opinion, case files, and criminal incident records. This text data is varied, voluminous, noisy, and non-standard in content, which poses great challenges for the intelligent analysis of public security intelligence. Automatically detecting and identifying valuable case clues in massive text information is therefore of great practical significance to successful case investigation. Oriented to joint case investigation, this thesis focuses on real issues in the intelligent analysis of public security business, such as the standardized management of massive text, people-event-object correlation analysis, and criminal gang mining. It studies the key technologies for these issues, including public security business text classification, joint entity relationship extraction, event extraction, and relationship mining and recommendation. The main contributions of this thesis are summarized as follows:

(1) A generative multi-task text classification method is proposed to solve the problem of classifying massive public security business text. It enables the automatic recognition and archiving of massive public security business text, satisfies the critical information retrieval requirements of joint case investigation, and improves the accuracy of text classification. Recognizing and archiving public security business text for joint case investigation requires both multi-label classification and hierarchical classification of the same text. The traditional strategy performs these two classification tasks independently, which leads to poor semantic correlation between the two sets of results, and often to outright semantic inconsistency between them, thus interfering with investigators. In addition, the accuracy of existing text classification methods needs further improvement, because classification performance directly influences the results of case analysis. To overcome these shortcomings, the proposed method transforms the multi-label classification and hierarchical classification tasks into a sequence-to-sequence generation problem and unifies the two tasks in one framework through a joint multi-task learning mechanism. It then improves the convergence speed of the model by optimizing the multi-label classification loss function, and improves the performance of hierarchical classification by introducing a hierarchical structure mask matrix. The experimental results show that the proposed method not only improves the accuracy of both classification tasks but also achieves semantic consistency between their predictions.

(2) A joint entity relationship extraction method is proposed for people-centered joint case investigation. The proposed method can extract person-to-person, person-to-object, and object-to-object entity relationships from text, and it meets the accuracy and real-time requirements of public security business analysis. People-centered joint case investigation focuses on correlation analysis between investigation entities, such as person and person, person and organization, and vehicle and vehicle. The key step in this analysis is to perform named entity recognition (NER) and relationship extraction (RE) on the classified public security business text with high accuracy and efficiency. However, existing joint entity relationship extraction methods do not sufficiently exploit label space information, and they cannot effectively handle entity relationships with long-distance dependencies. This thesis proposes a novel joint entity relationship extraction method based on a gate mechanism and a multi-head self-attention mechanism. The proposed method makes effective use of label information and can recognize entities and extract relationships even when there is a long-distance dependency between the entities. Experimental results on public data sets such as CoNLL04 and ADE show that the proposed method outperforms recent representative models such as SpERT and Deeper. In addition, experiments on real case examples further verify the applicability of the proposed method in real-world scenarios.

(3) A joint event extraction method is proposed for criminal-incident-centered joint case investigation. The proposed method addresses two weaknesses of existing methods: they have difficulty extracting multiple events from a single sentence, and they ignore the correlation between events. In joint case investigation centered on criminal incidents, investigators are concerned with role-related information about the incidents, such as suspects, victims, crime tools, crime locations, and crime times. Most existing event extraction studies address either the extraction of a single event from a single sentence or chapter-level event extraction. In contrast, extracting multiple events from a single sentence is difficult and has not been studied sufficiently, even though public security business text contains a large number of single sentences describing multiple events. To solve this problem, this thesis proposes a sequence-to-sequence joint event extraction method based on global event types. First, a global event type layer is introduced to make use of the sequence information of the whole sentence. Then, all candidate event types contained in the sentence are obtained to improve the efficiency and accuracy of event extraction. On the ACE2005 public data set, the F1 value of the proposed method is 1.8% higher than that of the baseline, and its performance on the other metrics is equivalent to the baseline. Experiments are also conducted on drug-related cases, and the results show that the method can be applied to the analysis of complex criminal incidents.

(4) A multi-dimensional association recommendation method is proposed to mine criminal gangs in joint case investigation. It can identify and recommend personnel related to a criminal incident, and it helps investigators combine these recommendations with their expert experience to quickly discover the accomplices or criminal organizations of specific suspects. Using the methods proposed in this thesis, information extraction can be performed on public security business text to obtain the phone numbers of suspects, and a structured communication data set can then be built by fusing other data. For this communication data set, a relationship mining method for criminal suspects is proposed. The method introduces multiple factors, such as habit similarity and shortest path priority, to build the HNNS multi-dimensional association recommendation model, which enables the identification and recommendation of criminal gangs. The experimental results show that the Top-5 optimal prediction accuracy of the proposed method is 81.3%, and the average prediction accuracy is 62.8%. The method can therefore portray the relationships between criminal suspects and their communication recipients, and can effectively help investigators infer criminal accomplices.

In conclusion, the research presented in this thesis effectively improves the joint case investigation capability of the police and promotes the development of policing intelligence analysis technology, and it has both theoretical research value and practical significance.
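
As an illustration of the hierarchical structure mask mentioned in contribution (1), the following is a minimal sketch of how such a mask might constrain level-by-level label decoding. The abstract does not describe the mask's actual construction, so everything here is an assumption made for illustration: the label names, the two-level hierarchy, the greedy decoding, and the helper names `build_mask`, `masked_argmax`, and `decode_path` are all hypothetical and are not taken from the thesis.

```python
import math

# Hypothetical two-level label hierarchy (illustrative only, not from the thesis).
HIERARCHY = {
    "<root>": ["theft", "fraud"],
    "theft": ["burglary", "pickpocketing"],
    "fraud": ["telecom_fraud", "online_fraud"],
}
LABELS = ["theft", "fraud", "burglary", "pickpocketing",
          "telecom_fraud", "online_fraud"]


def build_mask(hierarchy, labels):
    """Return mask[parent][j] == True when labels[j] is a child of parent."""
    return {parent: [label in children for label in labels]
            for parent, children in hierarchy.items()}


def masked_argmax(scores, allowed):
    """Index of the highest score among positions where allowed is True."""
    masked = [s if ok else -math.inf for s, ok in zip(scores, allowed)]
    return max(range(len(masked)), key=masked.__getitem__)


def decode_path(step_scores, hierarchy, labels):
    """Greedily pick one label per level, restricted to children of the previous pick."""
    mask = build_mask(hierarchy, labels)
    parent, path = "<root>", []
    for scores in step_scores:
        if parent not in mask:  # reached a leaf label: nothing left to decode
            break
        parent = labels[masked_argmax(scores, mask[parent])]
        path.append(parent)
    return path


# Made-up per-level scores over LABELS. At level 2 the unmasked maximum is
# "burglary", which is not a child of the level-1 choice "fraud".
step_scores = [
    [0.1, 0.9, 0.5, 0.2, 0.3, 0.1],
    [0.2, 0.1, 0.95, 0.3, 0.4, 0.6],
]
path = decode_path(step_scores, HIERARCHY, LABELS)
```

In this sketch the mask rules out any label that is not a child of the previously chosen label, so the decoded path always follows the hierarchy. That is one plausible way a mask could help keep hierarchical predictions consistent; the thesis may implement it differently.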
Keywords/Search Tags:Joint Cases Investigating, Text Mining, Text Classification, Entity Relation Extraction, Event Extraction, Relation Mining