Research On Information Extraction Algorithm For Legal Text

Posted on: 2023-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: W H Song
GTID: 2556306827975069
Subject: Computer Science and Technology
Abstract/Summary:
With the continuing reform of the judicial system, legal documents have become increasingly digitized, and the volume of legal text available on the Internet has grown exponentially. However, writing conventions differ considerably across types of legal text, so it is difficult to understand documents and analyze knowledge directly through rules. More and more researchers therefore apply natural language processing to legal texts, using information extraction to convert unstructured text into structured data, which promotes judicial informatization and improves judicial efficiency. Information extraction, which includes named entity recognition, relation extraction and event extraction, aims to extract entity-relation information and event information from text, and plays a vital role in tasks such as knowledge graphs, information retrieval and intelligent question answering. Named entity recognition and relation extraction are generally performed together through joint learning. Joint entity and relation extraction aims to extract entity-relation information from text and output relation triplets; event extraction aims to extract event information from text and determine the trigger word, event type, event arguments and argument roles. This paper carries out analysis and experiments on these tasks, and the main work is as follows:

(1) To extract relation triplets from legal documents, this paper presents a joint entity and relation extraction method based on a pre-trained language model. An end-to-end model directly outputs the relation triplets in the text. By fusing relational information into the label space, triplet extraction is transformed into sequence labeling, and entity and relation extraction is then completed by custom rules. Because datasets in the judicial field are scarce, a relation extraction dataset was constructed manually, and the model achieved good extraction results.

(2) To handle complex relation extraction in legal documents, this paper presents a relation extraction method based on grammar enhancement and a multi-head attention mechanism. Syntactic information is first injected into the model through Ordered Neurons Long Short-Term Memory (ON-LSTM), and a multi-head attention mechanism is then introduced to decompose complex overlapping relations. Experiments show that, compared with the pipeline method and other joint learning methods, the model extracts relation triplets effectively and achieves the best performance.

(3) To meet the need to express criminal facts in practical business, this paper presents an event extraction method based on a reading comprehension model. First, trigger word recognition templates of differing semantic richness are constructed manually, and the gain that the reading comprehension mechanism brings to trigger word extraction is explored. Then, event-specific argument extraction templates are constructed based on rules. The semantic information of each role is also integrated into the identification of candidate arguments, and argument extraction is completed using a dynamic threshold. The experimental results show that the proposed method achieves the best results on event extraction in the legal domain.
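
The sequence-labeling formulation described in (1) can be illustrated with a small sketch. The abstract does not specify the label scheme or the custom decoding rules, so everything here is an assumption made for illustration: the BIO-style tags `B-SUBJ`/`I-SUBJ` and `B-OBJ:<relation>`/`I-OBJ:<relation>`, the function names `extract_spans` and `decode_triplets`, the nearest-subject pairing rule, and the example sentence and relation names. It shows one way a relation can be fused into the tag of an object span so that triplets can be read off a label sequence.

```python
from typing import List, Tuple


def extract_spans(tokens: List[str], labels: List[str]) -> List[list]:
    """Collect [start_index, span_tokens, tag] entries from BIO-style labels.

    Assumed (hypothetical) tag set: 'SUBJ' for subjects and 'OBJ:<relation>'
    for objects, where the relation name is fused into the label itself.
    """
    spans = []
    current = None
    for i, (tok, lab) in enumerate(zip(tokens, labels)):
        if lab.startswith("B-"):
            if current is not None:
                spans.append(current)
            current = [i, [tok], lab[2:]]
        elif lab.startswith("I-") and current is not None and current[2] == lab[2:]:
            current[1].append(tok)
        else:
            # 'O' or an inconsistent 'I-' label closes any open span.
            if current is not None:
                spans.append(current)
            current = None
    if current is not None:
        spans.append(current)
    return spans


def decode_triplets(tokens: List[str], labels: List[str]) -> List[Tuple[str, str, str]]:
    """Turn a label sequence into (subject, relation, object) triplets.

    Assumed decoding rule: each object is paired with the nearest subject span.
    """
    subjects, objects = [], []
    for start, toks, tag in extract_spans(tokens, labels):
        text = " ".join(toks)
        if tag == "SUBJ":
            subjects.append((start, text))
        elif tag.startswith("OBJ:"):
            objects.append((start, text, tag.split(":", 1)[1]))

    triplets = []
    for o_start, o_text, relation in objects:
        if not subjects:
            continue
        _, s_text = min(subjects, key=lambda s: abs(s[0] - o_start))
        triplets.append((s_text, relation, o_text))
    return triplets


# Made-up example sentence and labels, as a tagger might output them.
tokens = ["Zhang", "San", "stole", "a", "mobile", "phone", "from", "Li", "Si"]
labels = ["B-SUBJ", "I-SUBJ", "O", "O", "B-OBJ:stole", "I-OBJ:stole",
          "O", "B-OBJ:victim", "I-OBJ:victim"]
print(decode_triplets(tokens, labels))
```

In this sketch the tagger only has to solve a token classification problem, while the pairing rule does the remaining work; a rule this simple would not handle sentences with several subjects sharing objects, which is the kind of overlapping case that (2) addresses.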
Keywords/Search Tags:Entity and Relation Extraction, Event Extraction, Information Extraction, Legal Intelligence