Research On Intelligent Answering Algorithms For Baidu Knows Questions

Posted on:2024-06-30

Degree:Master

Type:Thesis

Country:China

Candidate:Y C Li

GTID:2568307088955069

Subject:Applied statistics

Abstract/Summary:

An intelligent question answering system can, to a certain extent, understand and answer users' questions, addressing the limitation that traditional document retrieval only ranks documents without giving precise answers. At present, knowledge-base question answering systems are time-consuming and labor-intensive to build, because knowledge bases differ across domains, are large in scale, and are difficult to construct. A retrieval-based question answering system avoids these problems: it first compares the user's question with the documents in a document collection, selects the most similar documents, and then feeds the question and documents into a machine reading comprehension model to obtain the answer. Although Baidu Knows contains a large number of questions and answers, not every question has a corresponding answer, and automatically answering these unanswered questions is an important research task. To address this problem, this paper proposes an intelligent question answering model based on document retrieval and machine reading comprehension.

First, for Baidu's Chinese reading comprehension dataset DuReader, this paper builds a BERT-based machine reading comprehension model. The questions and documents are first re-segmented with a sliding window, and the resulting segments are used as the training set and fed to the BERT embedding layer for feature extraction. A fully connected layer and the softmax function then convert BERT's final hidden states into answer-span probabilities, yielding the start and end positions of the answer. The proposed model achieves a ROUGE-L of 0.425 and a BLEU-4 of 0.477 on DuReader, which are 10.1 and 8.4 percentage points higher, respectively, than the baseline model BiDAF.

Second, to obtain the documents, this paper builds a document retrieval model based on named entity recognition and word vectors. A BERT-CRF named entity recognition model is trained to label the entities in the input question, and documents are then retrieved by entity. If entity-based retrieval fails, word vectors are used to find words similar to the entities and retrieval is repeated, which helps improve retrieval accuracy. Finally, the user's question and the retrieved documents are fed into the reading comprehension model, which predicts the answer to the question.
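The sliding-window segmentation and the softmax-based choice of answer start and end positions mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the function names, the window size, the stride, the maximum answer length, and the example logits are all assumptions made for illustration, and the logits stand in for the output of a fully connected layer on top of BERT.

```python
import math


def sliding_windows(tokens, window_size, stride):
    """Split a token list into overlapping windows.

    Returns a list of (start_offset, window_tokens) pairs. The window size
    and stride are illustrative parameters, not values from the thesis.
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append((start, tokens[start:start + window_size]))
        # Stop once the current window reaches the end of the document.
        if start + window_size >= len(tokens):
            break
        start += stride
    return windows


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_logits, end_logits, max_answer_len):
    """Pick the (start, end) pair with the highest joint probability.

    The logits are assumed to come from a linear layer applied to the
    final hidden states; here they are just plain lists of floats.
    """
    start_probs = softmax(start_logits)
    end_probs = softmax(end_logits)
    best = (0, 0, 0.0)
    for i, p_start in enumerate(start_probs):
        # Only consider spans that end at or after the start position
        # and contain at most max_answer_len tokens.
        last = min(i + max_answer_len, len(end_probs))
        for j in range(i, last):
            score = p_start * end_probs[j]
            if score > best[2]:
                best = (i, j, score)
    return best


if __name__ == "__main__":
    document = list("abcdefghij")  # stand-in for a tokenized document
    for offset, window in sliding_windows(document, window_size=4, stride=2):
        print(offset, window)

    # Hypothetical logits for a four-token window.
    start, end, prob = best_span([0.1, 5.0, 0.2, 0.1], [0.1, 0.2, 4.0, 0.3], 3)
    print(start, end, round(prob, 3))
```

Overlapping windows keep an answer that straddles a segment boundary intact in at least one segment, as long as the answer is no longer than the overlap between neighbouring windows (window size minus stride). Restricting the search to spans with start not after end avoids invalid predictions.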

Keywords/Search Tags:

Machine reading comprehension, Named entity recognition, Word vector, Baidu Knows, Intelligent answer

Related items

1	Research On Chinese Named Entity Recognition Based On Machine Reading Comprehension And Feature Fusion
2	Research On Machine Reading Comprehension Methods Based On Incorporating External Knowledge
3	Research And Implementation Of Multimodal Named Entity Recognition Based On Deep Learning
4	Research And System Construction Of Named Entity Recognition Algorithm Based On Deep Learning
5	Research On Named Entity Recognition For Science And Technology Terms Based On Dependent Entity Word Vector
6	Research On Chinese Named Entity Recognition Based On Local Adversarial Transfer Training
7	Research On Word-vector-representation-based New Word Discovery And Name Entity Recognition
8	Research Of Chinese Named Entity Recognition Based On Deep Learning
9	Research On Named Entity Recognition Method For Network Security Domain
10	Research On The Method Of Reading Comprehension Answer Extraction Based On Discourse Relationship And Graph Structure