As a core requirement of intelligent justice, criminal charge prediction is an important component of intelligent criminal trial assistance systems. Improving the performance of existing methods on this task is of practical significance for the development of intelligent justice. This thesis takes the texts of single-charge criminal cases as its research object and focuses on two problems in the task: easily confused charges and class imbalance. By designing a multi-granularity feature extraction network and introducing the BERT pre-trained model, it improves prediction performance on few-sample and easily confused charges. The main work of this thesis is as follows:

1. A multi-granularity feature fusion mechanism with legal attribute awareness is designed. First, coarse-grained features of the shallow context are extracted with a capsule network, which reduces the information loss caused by pooling. Then, bi-directional attention flow lets the legal attribute sentences interact with the case description text to obtain deeply fused fine-grained features. Finally, a logit adjustment training strategy is used to optimize model training. Compared with the baseline model, the F1 score of the improved model increases by 2.0%, 2.5% and 1.7% on the three Criminal datasets, respectively.

2. A charge prediction model that integrates legal attribute attention with a BERT encoder is proposed. A legal attribute attention module extracts the attribute features that help distinguish charges and eases the difficulty of fine-tuning BERT on few-sample categories. At the same time, multi-scale features of key behaviors are obtained with TextCNN. Finally, a weight-adjusted Focal Loss function is used to improve training under the long-tail distribution. Compared with the baseline model MFMI, the F1 score of this model improves by 8.9%, 7.2% and 3.2% on the three Criminal datasets, respectively. In addition, the effectiveness of transferring manually designed labels between datasets is explored through a pseudo-label strategy.

3. Based on the above model, a charge prediction system is designed and implemented. The front-end pages are built with the jQuery and Bootstrap front-end frameworks, and the back end is built with Flask. The prediction model is deployed on the back end and provides the core functions of charge prediction and prediction result query. The prototype system was tested to verify its usability.
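
The two long-tail training strategies named above, logit adjustment and a weighted Focal Loss, can be illustrated with a minimal per-sample sketch. The abstract does not state the exact formulations used in the thesis, so the code below assumes the commonly used forms: cross-entropy on logits shifted by `tau * log(prior)`, and `-w_t * (1 - p_t)^gamma * log(p_t)`. The function names, `tau`, `gamma`, the class priors, and the class weights are all illustrative assumptions, not the thesis's settings.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Plain cross-entropy for one sample, used here as a reference point."""
    return -math.log(softmax(logits)[target])


def logit_adjusted_cross_entropy(logits, target, class_priors, tau=1.0):
    """Cross-entropy on logits shifted by tau * log(prior) (assumed form).

    Shifting by the log-prior lowers the adjusted score of rare classes
    during training, so a rare charge must win by a larger margin.
    """
    adjusted = [z + tau * math.log(p) for z, p in zip(logits, class_priors)]
    return -math.log(softmax(adjusted)[target])


def weighted_focal_loss(logits, target, class_weights, gamma=2.0):
    """Weighted focal loss for one sample (assumed form).

    (1 - p_t) ** gamma down-weights samples the model already classifies
    confidently; class_weights[target] rescales the loss per charge.
    """
    p_t = softmax(logits)[target]
    return -class_weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)
```

With hypothetical priors such as `[0.7, 0.2, 0.1]`, a sample whose true label is the rarest class receives a larger loss under `logit_adjusted_cross_entropy` than under plain cross-entropy. With `gamma=0` and unit weights, `weighted_focal_loss` reduces to plain cross-entropy.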