At present, the construction of information services in the field of civil engineering is still at the development stage. Designers have extensive knowledge retrieval and question answering needs during the design process, yet a large amount of domain knowledge, norms, and standards exists only as unstructured text. As a result, knowledge bases are difficult to build, colloquial natural language questions cannot be parsed effectively, and the demand for question answering cannot be met. This thesis therefore takes the application of intelligent question answering in the vertical domain of civil engineering as its entry point and uses natural language processing methods to study intelligent question answering technology. The work of this thesis is the construction of a knowledge-base-driven intelligent question answering system for civil engineering, which mainly involves natural language processing technology. According to the specific upstream and downstream technical difficulties, the work is divided into four parts: automatic construction of question-answer sentence pairs, automatic construction of the knowledge base, the intelligent question answering method, and construction of the intelligent question answering system. (1) To address the scarcity of question-answer corpus data in the field of civil engineering, a model for expanding the question-answer sentence data set is proposed. The model is a sequence learning model that combines BERT, Transformer, and UniLM and uses secondary pre-training. The training method transfers the grammatical and syntactic regularities of a large open-domain corpus to the civil engineering domain, combines them with a small amount of manually annotated in-domain data to capture semantic information, trains the stacked modules with random sampling, and, after parameter optimization, generates high-quality domain-specific target sentences that finally form natural language question-answer sentence pairs for the domain. The generated questions reached a best BLEU score of 26.19, an improvement of 12.78 over the LSTM baseline model. (2) To address the difficulty of building a knowledge base, CivilWoSpERT, an end-to-end joint-learning deep learning model for automated knowledge base construction, is proposed. The model is a span-based (sub-sequence segment) scheme for the joint extraction of named entities and relations that incorporates a word lattice embedding mechanism. It reached an F1 value of 87.47 on named entity recognition and an F1 value of 78.66 on relation extraction. (3) For the intelligent question answering method, a fast knowledge-base question answering method is proposed, which parses the natural language question and returns the target answer through a knowledge base search. (4) In the construction of the intelligent question answering system, the models and methods of the three preceding stages are integrated, and a mainstream Web architecture with separated front end and back end is used to complete the entire system. Through the above work, this thesis aims to study intelligent question answering in the field of civil engineering in depth and to solve specific problems such as the small size of available data sets, the difficulty of extracting information, and the insufficient coverage of domain knowledge by traditional knowledge bases, so as to build, by means of deep learning, a civil engineering intelligent question answering system that can meet practical question answering needs.
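
To make the span-based extraction idea in (2) more concrete, the following is a minimal sketch of how candidate sub-sequence spans and candidate span pairs might be enumerated before a classifier scores them. It is not the CivilWoSpERT implementation: the function names `enumerate_spans` and `candidate_pairs`, the `max_len` parameter, and the example tokens are all hypothetical, and the entity and relation classifiers as well as the word lattice embedding are omitted.

```python
from typing import List, Tuple

# A span is (start, end, tokens) with `end` exclusive. Hypothetical type for illustration.
Span = Tuple[int, int, Tuple[str, ...]]


def enumerate_spans(tokens: List[str], max_len: int = 4) -> List[Span]:
    """Return every contiguous sub-sequence of at most `max_len` tokens."""
    spans: List[Span] = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end, tuple(tokens[start:end])))
    return spans


def candidate_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return ordered pairs of non-overlapping spans.

    In a span-based joint extraction scheme, each such pair would be
    passed to a relation classifier (assumed here, not shown).
    """
    pairs: List[Tuple[Span, Span]] = []
    for head in spans:
        for tail in spans:
            # Two spans do not overlap if one ends before the other starts.
            if head[1] <= tail[0] or tail[1] <= head[0]:
                pairs.append((head, tail))
    return pairs


if __name__ == "__main__":
    tokens = ["reinforced", "concrete", "beam"]  # made-up example sentence
    spans = enumerate_spans(tokens, max_len=2)
    for start, end, text in spans:
        print(start, end, " ".join(text))
    print(len(candidate_pairs(spans)), "candidate span pairs")
```

In a real system, most enumerated spans would be filtered out by the entity classifier before pairs are formed, since the number of pairs grows quadratically with the number of spans.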