The Research Of Question Answering System For Law Domain

Posted on: 2019-12-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Zhou
Full Text: PDF
GTID: 2428330545469674
Subject: Computer Science and Technology
Abstract/Summary:
With the arrival of the Web 2.0 era, computing technology has developed rapidly and received extensive public attention. More and more people use the Internet to seek help and share knowledge, and they increasingly prefer simple and fast ways to obtain answers to their questions. As a result, the demand for Natural Language Processing has increased dramatically, and in-depth studies have been conducted on Question Answering systems (Q&A systems). Most of these studies focus on everyday services such as food and clothing, while research on Q&A systems for legal information retrieval in Chinese remains limited. This thesis provides technical insights into question classification and question-answer matching for Chinese legal data, and aims to construct a free and open Question Answering system for legal information retrieval with high accuracy. Its main innovations and contributions are as follows:

(1) We obtained legal document text through self-designed web crawlers or by purchase. After preprocessing, we built a corpus of legal information that includes legal community question-answer pairs, a legal dictionary, and a law and statute library.

(2) To classify Chinese legal questions, we extracted TF-IDF and word embedding features from the text. Through algorithm experiments and parameter tuning, we then proposed an improved fastText embedding approach. The proposed algorithm achieved 95.75% accuracy on the coarse question types, with an average recognition time of 5.15 ms per question.

(3) For question-answer matching, we exploited the self-learning capability of deep learning and combined it with an attention mechanism to propose an Attention-RCNN model, which is described and implemented in this thesis. We trained the model on labeled legal questions and answers, using the cosine similarity between the question and the candidate answer in the loss function. Word embeddings with a total vocabulary of 369,766, built from Chinese Wikipedia data combined with the collected legal text data, were included in the training set. The final model achieved a Top-1 accuracy of 61.33%, a higher semantic matching accuracy than the other algorithms it was compared against.

As far as we know, the work done in this thesis is the first attempt in this promising field.
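
The abstract does not state how cosine similarity enters the loss function or how Top-1 accuracy is computed. The following is a minimal sketch under one common assumption: a margin-based ranking loss that pushes the correct answer's cosine similarity above that of an incorrect answer, and a Top-1 metric that counts a question as correct when its highest-scoring candidate is the labeled answer. The function names, the margin value of 0.1, and the toy vectors are all hypothetical and are not taken from the thesis.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def margin_ranking_loss(question, positive, negative, margin=0.1):
    """Hinge loss on cosine similarities (assumed formulation, margin is hypothetical).

    The loss is zero once the correct answer scores at least `margin`
    higher than the incorrect answer.
    """
    pos_score = cosine_similarity(question, positive)
    neg_score = cosine_similarity(question, negative)
    return max(0.0, margin - pos_score + neg_score)


def top1_accuracy(examples):
    """Fraction of questions whose best-scoring candidate is the labeled answer.

    `examples` is a list of (question_vec, candidate_vecs, correct_index) tuples.
    """
    if not examples:
        return 0.0
    hits = 0
    for question, candidates, correct_index in examples:
        scores = [cosine_similarity(question, c) for c in candidates]
        best_index = max(range(len(scores)), key=scores.__getitem__)
        if best_index == correct_index:
            hits += 1
    return hits / len(examples)


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for encoded questions and answers.
    q = [1.0, 0.0]
    good = [0.9, 0.1]
    bad = [0.0, 1.0]
    print(margin_ranking_loss(q, good, bad))
    print(top1_accuracy([(q, [good, bad], 0)]))
```

In a real system the vectors would come from the trained encoder rather than being written by hand; the sketch only illustrates how a cosine score can serve both as a training signal and as the ranking criterion behind a Top-1 figure.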
Keywords/Search Tags:Q&A system, deep learning, fastText, crawler, legal dictionary