
Research On Question Similarity Computation In Domain Question Answering System

Posted on: 2019-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: P P Liu
GTID: 2428330566997303
Subject: Software engineering
Abstract/Summary:
Natural language processing, a research direction within artificial intelligence, has attracted wide attention. As the research has deepened, it has branched into more specialized areas, including information extraction, question answering, machine translation, text generation, text mining, and speech recognition and generation. Question answering systems let people interact with computers directly in natural language. As a new form of information retrieval, question answering has gradually become a research focus in natural language processing. Research on question answering systems is generally divided into question understanding, information retrieval and answer extraction, and all three depend on sentence similarity calculation. Although natural language processing technology can already be used to build many kinds of question answering systems, their results in practice are not satisfactory. The underlying problem is that sentence similarity calculation is not yet good enough. This thesis therefore studies question similarity computation in domain question answering systems.

The main contributions are as follows. First, an improved similarity calculation method based on word vectors is proposed. The improvement has two parts: optimizing the word vectors, and improving the similarity calculation scheme. The improved scheme builds on the WMD (Word Mover's Distance) algorithm and adds word co-occurrence information, which improves the accuracy of the calculation. Second, a question similarity calculation method based on deep learning is proposed. A candidate question set is first selected with a deep learning classification model, and exact matching is then performed with the word-vector-based question similarity method. Starting from an LSTM Siamese network classification model, this part proposes two deep learning classification models, a Siamese LSTM+CNN network and a Siamese network combining a CNN with an attention mechanism, to improve the accuracy of text classification. Experiments show that, compared with the original model, whose accuracy had already reached 94%, the two classification models gain about one percentage point, reaching about 95%. Finally, a similarity testing platform is developed on top of the proposed algorithms. It implements testing of the question similarity algorithms and demonstrates the effectiveness of the improved algorithms.

Experiments show that the improved word-vector-based similarity method is 6 percentage points more accurate than before the improvement. After the candidate set is selected through deep learning, F1 rises from 0.67 to about 0.84, and accuracy rises from 0.79 to 0.85.
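To make the word-vector-based similarity step more concrete, the sketch below computes a relaxed form of Word Mover's Distance between tokenized questions. It is an illustration only and not the thesis's algorithm. The relaxed variant, in which each word sends all of its weight to its nearest word in the other question, is swapped in here because it avoids the optimal-transport solver that exact WMD needs. The co-occurrence term added by the thesis is not shown, since the abstract does not specify it. The two-dimensional vectors in TOY_VECTORS and all function names are hypothetical.

```python
import math
from collections import Counter

# Toy 2-d word vectors, invented for illustration only (an assumption,
# not taken from the thesis). Real systems would load trained embeddings.
TOY_VECTORS = {
    "how": (0.1, 0.9),
    "reset": (0.9, 0.2),
    "change": (0.8, 0.3),
    "password": (0.5, 0.5),
    "weather": (-0.7, 0.6),
    "today": (-0.6, 0.8),
}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nbow(tokens):
    """Normalized bag-of-words weights over in-vocabulary tokens."""
    counts = Counter(t for t in tokens if t in TOY_VECTORS)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def one_sided_cost(src, dst):
    """Cost when every source word moves all its weight to its nearest target word."""
    return sum(
        weight * min(euclidean(TOY_VECTORS[w], TOY_VECTORS[x]) for x in dst)
        for w, weight in src.items()
    )


def relaxed_wmd(tokens_a, tokens_b):
    """Relaxed WMD: the larger of the two one-sided nearest-word costs.

    This is a lower bound on exact WMD; smaller values mean more similar questions.
    """
    a, b = nbow(tokens_a), nbow(tokens_b)
    if not a or not b:
        raise ValueError("each question needs at least one in-vocabulary token")
    return max(one_sided_cost(a, b), one_sided_cost(b, a))


if __name__ == "__main__":
    q1 = "how reset password".split()
    q2 = "how change password".split()
    q3 = "weather today".split()
    print(relaxed_wmd(q1, q2))  # small distance: near-paraphrases
    print(relaxed_wmd(q1, q3))  # larger distance: unrelated questions
```

In a two-stage setup like the one described above, a distance of this kind would be applied only to the candidate questions that the classification model has already selected.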
Keywords/Search Tags: similarity computation, unknown words, near antonyms, deep learning