Research On Named Entity Recognition For Legal Instruments

Posted on: 2023-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: Q P Ma
Full Text: PDF
GTID: 2556306908966299
Subject: Software engineering
Abstract/Summary:
In recent years, as information disclosure in the judicial field has continued to develop and deepen, more and more legal instruments have become publicly available. Legal instruments contain rich judicial business information, which is of great significance for improving the efficiency of judicial work. How to extract this information from legal instruments has therefore become an urgent problem. Information extraction, a technique for extracting structured information such as entities, relations and events from text data, is an effective solution. Within it, named entity recognition (i.e. entity extraction) is a basic task and a key prerequisite for subsequent relation extraction and event extraction. The study of named entity recognition is therefore particularly important for information extraction from legal instruments.

However, existing research on named entity recognition in legal instruments faces two problems. First, the lack of a sequence annotation corpus of legal instruments limits both research and application, and generic named entities are mostly classified by natural attributes, which does not meet the actual needs of entity extraction in legal instruments. Second, entities in legal instruments are often complex, and traditional named entity recognition models based on static word vectors suffer from the propagation of word segmentation errors and cannot represent the different meanings a word takes in different contexts. To address these problems, this paper constructs a sequence annotation corpus of legal instruments for theft cases and designs named entity recognition models for legal instruments. The main work of this paper consists of the following three parts:

(1) Using the CAIL2021 dataset and the legal instruments published by China Judgements Online as data sources, this paper defines ten categories of named entities: suspect, victim, stolen currency, value of goods, profit from theft, stolen goods, tools of crime, time of crime, place of crime and organization. On this basis it completes the construction of a sequence-labeled corpus of legal instruments for theft cases, with a corpus size of more than 970,000 words.

(2) To address the problems of static word vectors in named entity recognition for legal instruments, a BERT-Att-BiLSTM-CRF model based on BERT (Bidirectional Encoder Representations from Transformers) and an attention mechanism is designed. The BERT pretrained language model replaces the traditional static word vector representation in order to mine the deep semantic information in legal instruments, and the attention mechanism further improves the feature encoding ability of the model. The model achieves an F1 score of 88.74% on the legal instruments corpus built in this paper, which is 3.18% higher than that of the BiLSTM-CRF model and verifies the effectiveness of the model for named entity recognition in legal instruments.

(3) To address the problem that the BERT model uses character-level vector representations without considering lexical-level information, the BERT model is combined with the FLAT (Flat-Lattice Transformer) model to design a named entity recognition model that incorporates lexical-level boundary and semantic information. The resulting BERT-FLAT-CRF model achieves the best results on this legal instruments corpus, with an F1 score of 89.55%, which indicates that incorporating lexical-level information brings a measurable gain in named entity recognition for legal instruments.
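The abstract reports F1 scores on a sequence-labeled corpus but does not say how the labels are encoded or how F1 is computed. As an illustration only, the sketch below assumes a BIO tagging scheme and exact-match, entity-level F1, which is a common convention for this task. The label names (SUSPECT, STOLEN_GOODS) and the function names are hypothetical and are not taken from the thesis.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes an entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == etype
        if continues:
            continue
        if etype is not None:
            spans.add((etype, start, i))
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        else:
            # "O" or a stray "I-" tag: no entity is open.
            start, etype = None, None
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Entity-level F1: a prediction counts only if type and boundaries match."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for the "suspect" and "stolen goods" categories.
gold = ["B-SUSPECT", "I-SUSPECT", "O",
        "B-STOLEN_GOODS", "I-STOLEN_GOODS", "I-STOLEN_GOODS"]
pred = ["B-SUSPECT", "I-SUSPECT", "O",
        "B-STOLEN_GOODS", "I-STOLEN_GOODS", "O"]
score = entity_f1(gold, pred)
```

In this toy example the suspect span matches exactly, while the stolen goods span is predicted one token short and therefore counts as both a false positive and a false negative. This strict matching is one reason boundary information, such as the lexical-level information discussed in part (3), can matter for the score.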
Keywords/Search Tags:Legal Instruments, Named Entity Recognition, BERT, Attention Mechanism, Lexical Information