
Research On Question-answering Method Oriented To Small Data Volume Vertical Field

Posted on: 2021-04-10    Degree: Master    Type: Thesis
Country: China    Candidate: X J Lei    Full Text: PDF
GTID: 2428330614971050    Subject: Computer technology
Abstract/Summary:
Question answering (QA) has long been a research hotspot in natural language processing. A QA task extracts information relevant to a user's question from specific data and returns the correct answer. QA technology already has rich deployment scenarios in industry, helping people obtain information and knowledge more efficiently. Open-domain question answering systems are increasingly mature and largely satisfy industrial needs. However, most existing QA techniques struggle in vertical-domain scenarios with small data volumes: the sparseness of feature words in a vertical domain leads to high variance in the data set, and the small amount of data causes most QA algorithms to over-fit or under-fit. As a result, the current mainstream statistics-based methods and traditional neural network methods often perform poorly in this setting. To address these problems, this thesis proposes several solutions. Its innovations and contributions are as follows:

(1) A domain transfer method based on a large-scale open-domain QA corpus and a pre-trained language model. A large amount of open QA data is used to incrementally train the pre-trained language model, correcting the deviation between the semantic distribution of the original pre-training data and that of the target domain. With the help of the rich open QA corpus, the model learns more cross-domain word-sense information during pre-training, which alleviates the data-deviation problem in vertical-domain data and improves the accuracy of the QA task.

(2) QA-Predict, a pre-training method based on a large-scale open QA corpus. This method makes more effective use of the corpus and strengthens the model's reasoning ability during the pre-training stage, thereby improving QA accuracy in small-data vertical domains.

(3) A BERT model structure that introduces feature domain words. Feature domain words are extracted from large-scale data in the target domain and injected into the BERT model through the mask mechanism in attention, mitigating the problem of discrete feature words in the vertical domain and improving QA performance.

(4) A knowledge distillation strategy based on pre-trained language models. The model incrementally trained with QA-Predict serves as the teacher model; during the fine-tuning stage, training on its soft labels yields the student model TQA-Bert. This greatly reduces the parameter scale of the pre-trained language model, improving prediction speed and lowering deployment requirements.
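Contribution (1) continues masked-language-model training on open QA text. The abstract gives no implementation details; the following is a minimal sketch of the standard BERT-style token masking that such incremental (domain-adaptive) pre-training relies on. The function name `mlm_mask` and the 15% default rate are assumptions, and real BERT additionally keeps or randomizes some selected tokens rather than always masking them.

```python
import random

MASK = "[MASK]"

def mlm_mask(tokens, mask_prob=0.15, seed=0):
    """BERT-style token masking for incremental pre-training.

    Each selected token is replaced by [MASK] and becomes a
    prediction target; all other positions carry no label.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)   # hide the token from the model
            labels.append(tok)    # the model must recover it
        else:
            inputs.append(tok)
            labels.append(None)   # not a prediction target
    return inputs, labels
```

Running this over open-domain QA sentences produces (input, label) pairs on which the language model is further trained before fine-tuning on the small vertical-domain data.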
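Contribution (3) injects extracted feature domain words into BERT through the mask mechanism in attention. The exact mechanism is not specified in the abstract; below is a minimal NumPy sketch of one plausible reading: an additive mask that restricts attention to the keyword positions. The function name, the hard -1e9 mask value, and the single-head form are all assumptions for illustration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def keyword_masked_attention(q, k, v, keyword_positions):
    """Scaled dot-product attention restricted to domain feature words.

    Positions not listed in keyword_positions receive a large negative
    additive mask, so every query attends only to feature-word tokens.
    """
    seq_len = k.shape[0]
    mask = np.full(seq_len, -1e9)
    mask[keyword_positions] = 0.0
    # the 1-D mask broadcasts across all query rows of the score matrix
    scores = q @ k.T / np.sqrt(k.shape[-1]) + mask
    weights = softmax(scores)
    return weights @ v, weights
```

In a full model, such a mask would typically be applied only in selected heads or layers so that ordinary full-context attention is preserved elsewhere.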
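Contribution (4) trains TQA-Bert on the teacher's soft labels during fine-tuning. The abstract does not state the loss function; the sketch below shows the standard Hinton-style distillation objective that such a setup typically uses. The temperature `T`, mixing weight `alpha`, and function names are assumptions, not details from the thesis.

```python
import numpy as np

def softmax_t(logits, T=1.0):
    """Softmax with temperature T (T > 1 softens the distribution)."""
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, hard_label, T=2.0, alpha=0.5):
    """Mix soft-label and hard-label cross-entropy for the student.

    The soft term matches the student to the teacher's temperature-
    softened distribution; the hard term is ordinary cross-entropy
    against the ground-truth class index.
    """
    p_teacher = softmax_t(teacher_logits, T)
    p_student = softmax_t(student_logits, T)
    soft = -np.sum(p_teacher * np.log(p_student + 1e-12)) * T * T  # T^2 rescales gradients
    hard = -np.log(softmax_t(student_logits)[hard_label] + 1e-12)
    return alpha * soft + (1 - alpha) * hard
```

A student that agrees with the teacher and the ground truth incurs a lower loss than one that contradicts them, which is what drives the compressed model toward the teacher's behavior.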
Keywords/Search Tags:Question answering, Pre-trained language model, Knowledge distillation, Domain transfer, Keyword embedding