| In a society ruled by law, as criminal cases continue to accumulate, the state and the public pay increasing attention to the accuracy of sentence lengths in criminal cases and place higher demands on judicial personnel to sentence accurately. At present, judicial localization and personalization lead to inconsistent sentence lengths and judicial injustice. To promote consistent sentences for similar cases, predicting criminal sentences to assist judicial personnel in trying cases has attracted wide attention from researchers. Existing models suffer from several problems: they do not relate the contextual information of the case description well, the prediction models are very large, and the samples are unevenly distributed across classes. These problems lead to low sentence prediction accuracy. To improve prediction accuracy, this paper studies the shortcomings of current sentence prediction models in depth and proposes a sentence prediction model for criminal cases based on RoBERTa, which predicts the sentence from contextual semantic information. Transfer learning is used to fine-tune the pre-trained RoBERTa model on legal tasks, so that the model is better targeted to legal tasks and the accuracy of sentence prediction improves. Because the RoBERTa model is large, neuron-level model pruning is used to compress it and keep the model lightweight. To address the uneven class distribution of the samples, the class-balanced loss weighting strategy, adversarial training with the Fast Gradient Method, and focal loss, an improved cross-entropy loss function, are adopted together with the optimized RoBERTa model to complete the sentence prediction task, so that optimizing the training method improves prediction accuracy. The experiments use the data set provided by the 2018 "China Legal Research Cup" Judicial Artificial Intelligence Challenge. The sentence prediction results show that, compared with typical sentence prediction models, the optimized RoBERTa-based model, which draws on the contextual semantic information of the case description, achieves a higher prediction accuracy of 53.3%. Model pruning reduces the model size by one third while maintaining prediction accuracy. The optimized training methods for imbalanced sample classification also improve prediction accuracy by 4%-5%. |
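
As a rough illustration of the imbalance-handling losses named in the abstract, the sketch below combines class-balanced weights (based on the "effective number" of samples per class) with focal loss for a single example. It is not the paper's implementation: the function names, the values of `beta` and `gamma`, and the class counts are illustrative assumptions, and the code is written in plain Python rather than in a deep learning framework.

```python
import math


def class_balanced_weights(class_counts, beta=0.999):
    """Per-class weights proportional to (1 - beta) / (1 - beta ** n).

    n is the number of training samples in a class; rare classes
    receive larger weights. beta is an assumed hyperparameter.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in class_counts]
    # Normalize so that the weights sum to the number of classes.
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, weights=None, gamma=2.0):
    """Focal loss for one example: -w * (1 - p_t) ** gamma * log(p_t).

    With gamma = 0 and no weights this reduces to ordinary
    cross-entropy; larger gamma down-weights easy examples.
    """
    p_t = softmax(logits)[target]
    w = 1.0 if weights is None else weights[target]
    return -w * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical sentence-length classes with very uneven sample counts.
counts = [5000, 800, 40]
weights = class_balanced_weights(counts)
loss = focal_loss([2.0, 0.5, -1.0], target=2, weights=weights)
```

In this sketch the rarest class gets the largest weight, so a misclassified rare-class example contributes more to the loss than a frequent-class example with the same predicted probability.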