
The Research Of Chinese Automatic Question Answering And Proofreading Based On Deep Learning

Posted on: 2020-10-04
Degree: Master
Type: Thesis
Country: China
Candidate: L J Tang
Full Text: PDF
GTID: 2428330572495801
Subject: Information and Communication Engineering
Abstract/Summary:
Chinese automatic question answering and automatic proofreading are widely used in the field of Natural Language Processing. In Chinese automatic question answering, a user asks a question in natural language and the computer automatically returns a concise and accurate answer. In Chinese automatic proofreading, typos and grammatical errors in a text are identified and corrected automatically.

For Chinese question answering, this thesis first studies question answering based on sentence similarity. Because the keywords in a question may differ from those in the candidate text sentences, the question's keywords need to be expanded; in a specific field, however, expanding every keyword produces irrelevant answers. This thesis therefore uses dependency parsing to find the key words of the question and extends only those with a deep learning model. Experiments show that, in specific fields, extending the key words of questions improves the accuracy and recall of search.

To achieve true semantic retrieval, this thesis also studies automatic question answering based on a knowledge graph. Given the characteristics of the texts, the knowledge graph is constructed by applying entity extraction and then entity relation recognition. Entity extraction relies on a tagged corpus, and the tagging granularity of sentences affects its accuracy, so this thesis uses dependency parsing to extract phrases from the sequence and combines them with domain lexicons to increase the tagging granularity. Experiments show that this method improves the accuracy of entity extraction. Entity relation recognition identifies the semantic relations between entities. In specific fields, sentence structures are diverse, and the accuracy of entity relation recognition suffers if the relation tags do not cover the field. This thesis therefore adds new relation tags to those defined by HowNet, and experiments show that this also improves the accuracy of entity relation recognition.

This thesis also studies Chinese automatic proofreading. Existing automatic proofreading systems proofread words against a large-scale lexical corpus; they struggle with syntactic and semantic proofreading and do not support large-scale free text. This thesis therefore uses entity extraction and the knowledge graph to achieve semantic proofreading. Four types of semantic errors are considered: typos, missing components, contradictions and missing content. Compared with a widely used Chinese automatic proofreading system, this semantic proofreading method achieves a higher recall rate.

Finally, taking a data structure course as an example, this thesis develops separate prototype systems for Chinese automatic question answering and for automatic proofreading. The question answering prototype integrates three search methods: FAQ library search, search based on sentence similarity and search based on the knowledge graph. To make the system more intelligent, an interaction module and model training modules are added. The proofreading prototype integrates four proofreading functions: typos, missing components, contradictory definitions and comprehensive proofreading. In addition to performing semantic proofreading, it displays the text and the proofreading results line by line for readability.
Keywords/Search Tags:Chinese automatic question answering, Sentence similarity, Knowledge graph, Entity relation extraction technology, Chinese automatic text proofreading
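
The selective keyword expansion described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the `EXPANSIONS` table is a hypothetical stand-in for the deep learning model, the key words are passed in by hand rather than found by dependency parsing, similarity is plain bag-of-words cosine, and whitespace tokenization is assumed (Chinese text would first need word segmentation).

```python
import math
from collections import Counter

# Hypothetical expansion table; stands in for the thesis's deep learning model.
EXPANSIONS = {"stack": ["lifo"], "queue": ["fifo"]}


def expand(tokens, key_words):
    """Append expansions only for the selected key words, not for every token."""
    expanded = list(tokens)
    for token in tokens:
        if token in key_words:
            expanded.extend(EXPANSIONS.get(token, []))
    return expanded


def cosine(a, b):
    """Cosine similarity between two token lists, using term counts."""
    count_a, count_b = Counter(a), Counter(b)
    dot = sum(count_a[t] * count_b[t] for t in count_a)
    norm_a = math.sqrt(sum(v * v for v in count_a.values()))
    norm_b = math.sqrt(sum(v * v for v in count_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_sentence(question, key_words, sentences):
    """Return the candidate sentence most similar to the expanded question."""
    query = expand(question.lower().split(), key_words)
    return max(sentences, key=lambda s: cosine(query, s.lower().split()))


sentences = [
    "a fifo structure removes the oldest item first",
    "a lifo structure removes the newest item first",
]
answer = best_sentence("how does a stack remove items", {"stack"}, sentences)
print(answer)
```

Here the question shares no content word with either sentence until "stack" is expanded, after which the second sentence scores higher. Restricting expansion to `key_words` mirrors the abstract's point that expanding every keyword introduces irrelevant matches.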