Smart justice is an indispensable part of modern urban governance. Current smart justice systems are usually built on a knowledge graph. Named entity recognition in legal texts, that is, the extraction of key case elements, is one of the basic tasks in building such a knowledge graph and is therefore crucial to establishing a smart justice system. Because the types of case elements in legal documents are numerous and still growing, traditional deep learning-based named entity recognition methods require a large amount of manual labeling, and they perform poorly when trained on small-scale data sets. To address this problem, this paper proposes a method for extracting case elements from small-scale annotations. Based on an analysis of the characteristics of case elements, a named entity recognition model that combines rules with machine learning is implemented. First, the data are preprocessed: the content of different types of legal documents is analyzed and the descriptions of the case facts are extracted. For the case elements to be extracted, a seed dictionary is constructed manually. A bootstrapped pattern learning method, with the semantic similarity between entities as its criterion, then iteratively expands this manually constructed positive seed dictionary. Next, dependency parsing is performed on the case fact descriptions, and the data set is annotated with the expanded seed dictionary according to syntactic component rules for the case elements, which are set through observation and analysis. The BERT Chinese pre-trained model encodes the data into vectors, which are fed into a BiLSTM+CRF deep learning model for training; the trained model serves as the case element extraction model. Finally, by comparing the named entity recognition results for various case elements, this paper evaluates the transferability and scalability of the case element extraction model. The resulting model relies only on small-scale manual labeling: with a small amount of manually labeled data, its F1 score exceeds 80%. An analysis of the model's performance shows that it achieves better results on the same small-scale annotated data set and that it has good scalability, but it is not transferable.
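
The iterative seed dictionary expansion described above can be illustrated with a minimal sketch. It assumes that each candidate entity already has an embedding vector and that a candidate joins the dictionary once its cosine similarity to any current entry reaches a threshold. The function names, the toy two-dimensional vectors, the example terms, and the threshold value are all hypothetical and are not taken from the paper, whose actual similarity measure and stopping rule are not given here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_seeds(seeds, candidates, vectors, threshold=0.9, max_iters=5):
    """Grow a seed dictionary by repeatedly adding similar candidates.

    Hypothetical helper: a candidate is accepted when its best cosine
    similarity to any entry already in the dictionary is >= threshold.
    Iteration stops when a pass adds nothing or max_iters is reached.
    """
    dictionary = set(seeds)
    for _ in range(max_iters):
        added = {
            c
            for c in candidates
            if c not in dictionary
            and max(cosine(vectors[c], vectors[s]) for s in dictionary) >= threshold
        }
        if not added:
            break
        dictionary |= added
    return dictionary


# Toy 2-D embeddings (illustrative only, not real model output).
vectors = {
    "knife": (1.0, 0.0),
    "dagger": (0.95, 0.1),
    "machete": (0.85, 0.3),
    "bank": (0.0, 1.0),
}

expanded = expand_seeds(
    {"knife"}, ["dagger", "machete", "bank"], vectors, threshold=0.96
)
print(sorted(expanded))
```

In this toy run, "machete" is not similar enough to the original seed and only joins after "dagger" has been added in the first pass, which is the effect the iteration is meant to capture. The same mechanism can also let unrelated terms drift in over many passes, so the threshold and the iteration cap both matter in practice.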