A Legal Question Answering System Based On BERT

Posted on: 2022-11-23
Degree: Master
Type: Thesis
Country: China
Candidate: C X Wang
Full Text: PDF
GTID: 2506306770469924
Subject: Computer Software and Application of Computer
Abstract/Summary:
As society develops, people increasingly value convenience and efficiency. In recent years the Internet has grown rapidly worldwide, and users can put all kinds of questions to search engines. Search engines have improved the speed and quality of information retrieval to a certain extent. However, their drawbacks become apparent when users face professional problems: a search engine returns a list of relevant links, and users must still rely on their own experience to filter the results. Furthermore, with the advent of the big data era, the volume of information grows day by day, so search engines alone can no longer meet people's needs. Retrieving the required information accurately and quickly from massive bodies of knowledge has therefore become a research direction, and automatic question-answering systems have been proposed in response. Nevertheless, researchers still have a long way to go in developing domain-specific question-answering systems, especially in the legal domain.

In addition, at the Central Work Conference on Comprehensively Governing the Country by Law in 2020, President Xi Jinping pointed out that it is necessary to strengthen research on and publicity of legal theory, and to focus on ensuring that the people experience fairness and justice in every legal system, every law enforcement decision, and every judicial case. The aim is to deepen the comprehensive reform of the judicial system. To help achieve this goal, this thesis develops a BERT-based legal question-answering system.

The main contributions of this thesis are as follows.

(1) The literature on question-answering systems in China and abroad is reviewed. Most existing human-machine question-answering systems are not based on the pre-trained language model BERT, and those that are based on BERT rarely address the legal field. The existing BERT-based legal question-answering system targets legal issues related to COVID-19, whereas this thesis mainly addresses general legal issues. Its dataset contains more than 1,000 pieces of data, while that of this thesis includes more than 18,000 pieces of data.

(2) This thesis collects and codifies more than 30,000 pieces of legal question-and-answer data and uses the collected data to construct a legal question-and-answer database.

(3) This thesis implements a prototype legal question-answering system, including a human-computer interaction interface. First, a user asks the system a legal question through the interface. The system then uses BERT to vectorise the question. Next, it finds the most similar questions by computing the similarity between the vector of the asked question and the question vectors of the question-answer pairs stored in the dataset. Finally, it retrieves the corresponding answers from the legal question-and-answer database according to the question ID.

(4) This thesis carries out extensive experiments comparing nine text similarity calculation methods and, based on the results, selects the method best suited to the proposed system. It also compares word vector models experimentally. The experiments show that the BERT model used in this thesis performs best at vectorising questions.
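
The retrieval flow described in contribution (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: the `embed` function is a bag-of-words stand-in for the BERT encoder so that the example runs without any model download, cosine similarity is assumed as the similarity measure (the abstract does not say which of the nine compared methods was selected), and the three-entry `QA_DATABASE` with its placeholder answers is hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy database: question ID -> (stored question, stored answer).
QA_DATABASE = {
    1: ("How do I file for divorce?", "Placeholder answer about divorce filing."),
    2: ("What is the penalty for theft?", "Placeholder answer about theft penalties."),
    3: ("Can my landlord keep my rental deposit?", "Placeholder answer about rental deposits."),
}


def embed(text):
    """Stand-in for the BERT encoder: a sparse vector of token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(u, v):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(u[token] * v[token] for token in u.keys() & v.keys())
    norm_u = math.sqrt(sum(count * count for count in u.values()))
    norm_v = math.sqrt(sum(count * count for count in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Vectorise every stored question once, keyed by question ID.
QUESTION_VECTORS = {qid: embed(question) for qid, (question, _) in QA_DATABASE.items()}


def answer(user_question, top_k=1):
    """Return (question ID, similarity, answer) for the top_k most similar stored questions."""
    query_vector = embed(user_question)
    scored = sorted(
        ((cosine_similarity(query_vector, vector), qid)
         for qid, vector in QUESTION_VECTORS.items()),
        reverse=True,
    )
    # Look the answer up in the database by question ID, as in the described flow.
    return [(qid, score, QA_DATABASE[qid][1]) for score, qid in scored[:top_k]]


if __name__ == "__main__":
    for qid, score, text in answer("What penalty applies to theft?"):
        print(qid, round(score, 3), text)
```

In a system closer to the one described, `embed` would be replaced by a function that returns a dense sentence vector from a BERT model, while the similarity search and the ID-based answer lookup would keep the same shape.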
Keywords/Search Tags: Automatic question answering, Pre-training language model, BERT, Machine learning, Text similarity