Research On Named Entity Recognition For Judgment Documents

Posted on:2022-10-14

Degree:Master

Type:Thesis

Country:China

Candidate:Y Y Deng

Full Text:PDF

GTID:2506306545455364

Subject:Computer software and theory

Abstract/Summary:


As the final product of trial activities, judgment documents contain abundant information. Named entity recognition over these documents lays a foundation for constructing a judgment document knowledge graph. Some corpora have been developed for the study of judgment documents, but the entities they label are not comprehensive, and no corpus is publicly available for the industry and subject entities that this paper is concerned with. In addition, because word segmentation tools designed for judgment documents are lacking, segmentation quality is low, which degrades named entity recognition. To avoid the impact of word segmentation errors, this paper therefore focuses on character-based named entity recognition for judgment documents. Because word information is still useful, it proposes two methods for integrating word information into character-based models. The research covers the following three aspects:

(1) A corpus of named entities is constructed from civil judgment documents, referred to below as the judgment documents corpus. The main steps are analyzing the structure of the judgment documents, preprocessing them, and formulating the corresponding annotation specifications, which together yield a usable experimental corpus.

(2) A model based on the direct integration of character and word information. On top of the character information it acquires, the model simply concatenates pre-trained word vectors. Because judgment documents form a long-sequence corpus, the model takes single characters as input, uses a BiLSTM as the encoder, and adds an attention layer to compute the contextual representation of each input character. To make use of lexical information, the CBOW model is trained on a large unlabeled corpus of judgment documents to obtain pre-trained word vectors. Finally, the word vector and the contextual character representation are concatenated and fed to a CRF layer for label prediction.

(3) A model based on multi-level feature fusion of character and word information. On the one hand, the direct-integration model above fails to fully exploit the latent information in words. On the other hand, compared with a single embedded representation, multi-level feature fusion of characters and words can often capture more useful information. A model based on multi-level feature fusion of characters and words is therefore proposed to make full use of word information in character-based models. Specifically, the model takes characters as input, first uses a BiLSTM and a CNN to extract character-level features at multiple levels, and then obtains word-level features through word encoding. Finally, the two are fused to form the final representation of the input sequence, which is fed into the model for training to complete the entity recognition and labeling task.

The experimental results show that the model based on the direct integration of character and word information effectively improves named entity recognition performance on judgment documents. The method based on multi-level feature fusion outperforms both the baseline method and the direct-integration model.
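
The concatenation ("splicing") step of the direct-integration model can be illustrated with a minimal sketch. The abstract does not say how each character is matched to a word vector, so this sketch assumes a segmented sentence is available and that every character inherits the vector of the word containing it. The function name `splice_char_word_features`, the zero-vector fallback for unknown words, and the toy vectors are all hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def splice_char_word_features(
    words: Sequence[str],
    char_context: Sequence[Vector],
    word_vectors: Dict[str, Vector],
    word_dim: int,
) -> List[Vector]:
    """Concatenate each character's contextual vector with its word's vector.

    words        -- the sentence as a list of segmented words (assumed given)
    char_context -- one contextual vector per character, e.g. encoder output
    word_vectors -- pre-trained word vector lookup table
    word_dim     -- dimensionality of the word vectors
    """
    total_chars = sum(len(word) for word in words)
    if total_chars != len(char_context):
        raise ValueError("segmentation and character sequence differ in length")

    fused: List[Vector] = []
    position = 0
    for word in words:
        # Assumption: words missing from the table fall back to a zero vector.
        word_vec = word_vectors.get(word, [0.0] * word_dim)
        for _ in word:
            # Splice: character representation followed by the word vector.
            fused.append(list(char_context[position]) + list(word_vec))
            position += 1
    return fused


# Toy example: two words, four characters, 2-d character and 3-d word vectors.
words = ["原告", "张三"]
char_context = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
word_vectors = {"原告": [1.0, 0.0, 0.0]}
fused = splice_char_word_features(words, char_context, word_vectors, word_dim=3)
```

Each fused vector here has five dimensions (two from the character, three from the word). In the model the abstract describes, a sequence of this kind is what would be passed to the CRF layer for label prediction.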

Keywords/Search Tags:

Judgment Documents, Named Entity Recognition, BiLSTM-CRF Model, Industry and Subject, Characters and Words Feature


Related items

1	Research On Chinese Named Entity Recognition In Judicial Field
2	Research On Named Entity Recognition Of Court Judgment Documents Based On Deep Learning
3	Research On Named Entity Identification Of Legal Documents
4	Research On Named Entity Recognition Based On Legal Documents In Civil Cases
5	Named Entity Recognition For Divorce Legal Documents
6	Research On Named Entity Recognition For Antiterrorism Field
7	Research On Enterprise Entity Recognition And Classification For Court Documents
8	Research On The Construction Technology Of Criminal Law Knowledge Graph
9	Named Entity Recognition For Judicial Document Data
10	A Study Of Chinese Named Entity Recognition For Judicial Field