Research On Information Extraction Of Court Cases Based On Code Analysis And Pattern Matching

Posted on: 2024-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: Q Z Chen
GTID: 2556307091997119
Subject: Software engineering
Abstract/Summary:
To promote fairness and motivation among post judges in handling cases, the Court has implemented a performance management system for post judges. Each year's cases are tallied and converted into XML files for storage, and points are then assigned to the corresponding post judges so that they can be ranked according to the statistical case information. However, given the large volume of case data, analysing it manually or with traditional statistical tools is labour-intensive and time-consuming. How to make effective use of the XML files that store case information, extract that information, calculate the performance of the post judges from it, and thereby relieve the pain point of having few staff and many cases is a current issue that needs attention.

In this thesis, through communication with court staff, case information is divided into independent case information and related case information. By analysing the different characteristics of the two as stored in XML files, corresponding XML parsing and extraction methods are designed to extract the case information and count the performance scores of the corresponding post judges, so as to ensure that the scores for handling cases are fair and accurate. Specifically, this thesis carries out the following research:

(1) Pre-processing of case datasets. Some cases in the dataset are not valid for calculating the performance scores of post judges. Such dirty data should be removed from the dataset before case information is extracted, both to avoid unnecessary cost in later experiments and to ensure the accuracy of the subsequent extraction results and of the performance scores. This thesis therefore designs a set of dataset pre-processing methods based on case content information to remove dirty data from the case dataset effectively and ensure the quality of subsequent experiments.

(2) Case information extraction based on XML code parsing and pattern matching. The cases in the dataset are divided into independent case information and the more complex related case information. Independent cases have no connection to one another, and their extraction results do not interfere with each other. Based on an analysis, carried out before the experiments, of how case content is stored in the XML files, an extraction method for independent case information built on the XML content storage structure is designed. Combining DOM-based and SAX-based parsing improves the accuracy and efficiency of case information extraction. Related case information concerns cases that are linked to each other and need to be bundled together when performance scores are calculated. Applying the independent case extraction method to them would lead to incomplete extraction of case information or loss of the links to other cases, which in turn would cause these cases to be scored incorrectly in the post judges' performance. By constructing templates for extracting related cases with different characteristics, a method for extracting related case information based on XML code parsing and pattern matching is designed. It lists and stores the extracted related case information in a more intuitive form, which makes it easier to count and score the performance of post judges afterwards and ensures the accuracy and fairness of performance scoring for related cases.

(3) Statistical management of post judges' performance scores. The thesis draws on the ideas behind the judges' appraisal systems of different district courts, surveys their performance appraisal formulas as a reference for adjustment, and works with the performance appraisal system for post judges already in place in the court, so that the final performance scores are more convincing. For the convenience of court staff, the various performance appraisal indicators and the different case extraction methods are applied in the judges' performance management system, which uses information technology to support the fairness and impartiality of the court's performance appraisal and helps the court manage its personnel better.

This thesis proposes a method for extracting case information based on a combination of XML code parsing and pattern matching, and validates it across different file sizes, different years and different case closure methods. The method not only reduces the burden of case information processing on court staff, but also makes the statistics of post judges' case performance scores fairer and more efficient, and provides a reference for the post judges' performance management systems of the relevant courts.
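The abstract does not say how the DOM-based and SAX-based passes are combined or what the extraction templates look like, so the following is only a minimal sketch of the general idea under stated assumptions. The XML schema (`case`, `summary`, the `id` and `judge` attributes), the case-number format matched by `CASE_REF`, and all function names are hypothetical and not taken from the thesis. A SAX pass streams the file to index cases, a DOM pass reads each case's text and pattern-matches references to other cases, and a small union-find (a technique chosen here for illustration) bundles related cases so they can be scored together.

```python
import re
import xml.sax
from xml.dom import minidom

# Hypothetical schema and data, for illustration only.
SAMPLE = """<cases>
  <case id="C001" judge="Zhang"><summary>Contract dispute, closed by judgment.</summary></case>
  <case id="C002" judge="Li"><summary>Appeal related to case C003.</summary></case>
  <case id="C003" judge="Li"><summary>Original loan dispute.</summary></case>
</cases>"""

CASE_REF = re.compile(r"\bC\d{3}\b")  # assumed case-number format


class CaseIndexHandler(xml.sax.ContentHandler):
    """SAX pass: stream the file and record each case's judge without building a tree."""

    def __init__(self):
        super().__init__()
        self.judges = {}

    def startElement(self, name, attrs):
        if name == "case":
            self.judges[attrs["id"]] = attrs["judge"]


def index_cases(xml_text):
    handler = CaseIndexHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.judges


def find_links(xml_text, known_ids):
    """DOM pass: read each summary and pattern-match references to other known cases."""
    links = []
    doc = minidom.parseString(xml_text)
    for case in doc.getElementsByTagName("case"):
        case_id = case.getAttribute("id")
        summary = case.getElementsByTagName("summary")[0]
        text = "".join(n.data for n in summary.childNodes if n.nodeType == n.TEXT_NODE)
        for ref in CASE_REF.findall(text):
            if ref != case_id and ref in known_ids:
                links.append((case_id, ref))
    return links


def group_cases(ids, links):
    """Bundle linked cases with a small union-find; independent cases stay alone."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(g) for g in groups.values())


judges = index_cases(SAMPLE)
links = find_links(SAMPLE, judges)
groups = group_cases(judges, links)
print(groups)
```

In this sketch an independent case ends up in a group of one, while cases that reference each other share a group, which is the property a scoring step would need in order to treat related cases as a bundle.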
Keywords/Search Tags: case information extraction, XML parsing, case-handling performance of post judges, pattern matching