Research On Technology Of Similar Case Retrieval Based On Information Processing Of Judicial Case Elements

Posted on: 2024-03-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W F Hu
GTID: 1526307202954879
Subject: Electronic information
Abstract/Summary:
With the reform of China’s judicial system, judges have gained greater autonomy and discretionary authority, which has led to inconsistent sentencing standards, divergent judgments in similar cases, and non-standardized sentencing. To summarize judicial expertise and standardize sentencing criteria, the Supreme People’s Court has successively established a case guidance system for the retrieval of similar cases, mandating that courts at all levels handle similar cases uniformly. In parallel, with the advancement of information technology, video surveillance has become an integral component of public security and safety. In judicial case processing, video surveillance can faithfully document the sequence of events and the points of dispute, providing vital support for adjudicators.

In meeting the similar case retrieval requirements set by the Supreme People’s Court of China, judicial personnel face several challenges in practice: (1) the need to become familiar quickly with authoritative case databases, such as the Judgment Document Network; (2) the complexity of existing retrieval tools and their limited search parameters; (3) the absence of similarity scores across a large set of retrieved cases, which forces judicial personnel to judge case similarity manually; (4) the length and complexity of audiovisual materials, particularly video, which make it difficult to select the evidence that substantiates a case.

Similar case retrieval is a core technology for building an intelligent judicial similar case retrieval system. Its primary objective is to support the consistent adjudication of similar cases throughout the judicial process by identifying comparable judgments in prior cases, thus providing valuable references to judges. It plays a significant role in the advancement of judicial intelligence. In addition, query-based video summarization supports personalized retrieval of video summaries: by setting specific retrieval criteria, it narrows the scope of inquiry and significantly reduces the time required to sift through video data.

This thesis focuses on the task of similar case retrieval in the judicial field. It combines natural language processing, information retrieval, and deep learning methods. The study examines essential information extraction technologies, namely Chinese named entity recognition and entity relationship classification, as well as similar case retrieval techniques, and ultimately constructs a similar case retrieval system. The main research work of this thesis is summarized as follows:

(1) Existing lexicon-enhanced named entity recognition models focus primarily on improving entity boundary recognition and overlook the influence of inter-word relationships on entity categorization. In response, this thesis introduces the PasLEBERT named entity recognition model. PasLEBERT builds on strengthened entity boundary recognition and supplements it with syntactic structural relationships between words, which carry directional information. Using a tensor-based feature fusion method, it explores the interdependencies among character-level, word-level, and sentence-level features, so that the features complement one another while noise from redundant data is reduced. This improves the accuracy of extracting essential case elements. The method was validated experimentally on four general Chinese named entity recognition datasets and one custom dataset. Compared with the lexicon-enhanced model LEBERT, published at ACL, the F1 score improved by 0.43% to 2%; compared with the character structure-enhanced model MECT, the F1 score improved by 0.89% to 8.86%.

(2) Entity relationships in Chinese text can be ambiguous to categorize, particularly when entities share the same semantic relationship but differ in entity direction. To address this, the thesis presents a relationship classification method named Dic-Transformer. The method incorporates part-of-speech distinctiveness information to strengthen the model’s ability to distinguish between relationship categories that involve the same semantic entities but different entity directions. A more efficient Transformer model encodes the syntactic and semantic information of input sentences, improving the context-awareness of the semantic encoding module. An attention-based feature fusion module integrates semantic information with relationship distinction cues, so that the model preserves the individual characteristics of the fused features while capturing their interactions. The model is optimized with a combination of cross-entropy and max-margin functions, so that it learns standard category information during training while maximizing the likelihood of accurate predictions. This strengthens the model’s capacity to comprehend network states and improves relationship classification accuracy. Experimental validation on the relationship classification dataset SemEval-2010 Task 8 yields an F1 score of 86.2%.

(3) To handle the distinctive textual structure and the length of judicial case judgments, the thesis proposes a similar case retrieval model named BERT-LF, which integrates the factual elements of judicial cases. The model infers the similarity of entire legal cases by aggregating paragraph-level semantic interactions. It uses BERT text encoding to semantically encode query and candidate paragraphs, preserving the model’s semantic learning capability while addressing the length of legal texts. It also identifies and expands the legal elements that significantly affect case judgments. By deeply integrating case factual elements, document themes, and basic textual semantic information, it produces a document representation better suited to legal contexts. The method is validated experimentally on LeCaRD, a Chinese legal case retrieval dataset. Compared with the advanced similar case recommendation model BERT-PLI, BERT-LF achieves a 17% increase in P@5, a 9% increase in P@10, and a 15.6% increase in MAP. On the ranking metrics, it outperforms BERT-PLI by 7.3% in NDCG@10, 5.7% in NDCG@20, and 2.8% in NDCG@30.

(4) To implement the guidance requirements for similar case retrieval put forth by the Supreme People’s Court of China, and to address the low efficiency of case retrieval and similar case assessment caused by the vast amount of judicial case data, the thesis adopts the MVT (Model-View-Template) software architecture design concept and the B/S (browser/server) architecture pattern. Using the Django web development framework and the open-source Solr search engine, it integrates the similar case recommendation algorithm proposed in this research and implements a retrieval-based similar case recommendation system. The system provides three main functions: rapid retrieval of judicial case data, intelligent similar case recommendation, and retrieval of summaries of case video materials.
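
The ranking metrics reported for BERT-LF in item (3), namely P@k, MAP, and NDCG@k, can be illustrated with a short sketch. This is not the thesis's evaluation code. The function names, the example labels, the relevance threshold (label >= 2), and the linear-gain form of DCG are all assumptions made for illustration.

```python
import math
from typing import Sequence


def precision_at_k(relevant: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked candidates that are relevant."""
    return sum(relevant[:k]) / k


def average_precision(relevant: Sequence[bool]) -> float:
    """Average of precision@i over every rank i that holds a relevant case.

    Assumes the ranking covers the whole candidate pool, so every relevant
    case appears somewhere in the list.
    """
    hits, total = 0, 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """MAP: average precision averaged over all queries."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


def ndcg_at_k(gains: Sequence[float], k: int) -> float:
    """DCG of the top-k ranking divided by the DCG of the ideal ordering.

    Uses linear gains and a log2 discount (one common variant; assumed here).
    """
    def dcg(values: Sequence[float]) -> float:
        return sum(g / math.log2(rank + 1) for rank, g in enumerate(values, start=1))

    ideal = dcg(sorted(gains, reverse=True)[:k])
    return dcg(gains[:k]) / ideal if ideal else 0.0


# Hypothetical graded labels (0-3) of the candidates, in the order a model ranked them.
labels = [3, 0, 2, 1, 0]
relevant = [label >= 2 for label in labels]  # assumed relevance threshold

print(precision_at_k(relevant, 5))
print(average_precision(relevant))
print(ndcg_at_k(labels, 5))
```

Averaging each score over all queries gives the dataset-level figures on which retrieval models are compared.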
Keywords/Search Tags: Key Information Extraction, Named Entity Recognition, Relationship Classification, Similar Case Retrieval