The production safety situation in China's coal mines remains severe, and employees' knowledge of production safety is still limited. It is therefore necessary to use automatic question answering technology to strengthen safety training and to support safety-related decision-making. Research on open-domain question answering systems is developing rapidly, but specific domains lack high-quality question answering datasets, and mainstream models cannot be applied to them directly. This thesis therefore studies the regulations governing coal mine safety, constructs a coal mine safety question answering dataset, and, on that basis, develops an automatic question answering system for coal mine safety. The main tasks are as follows:

1. A coal mine safety question answering dataset was constructed. Starting from the BERT-wwm model, the improved pre-training method of MacBERT was used to train on a coal mine safety domain corpus, yielding the CoalBERT model, which is suited to the coal mine safety field. A sequence-to-sequence question-answer pair generation model was built by combining CoalBERT with UniLM and training on a small, manually annotated dataset. The generated questions were then augmented with a data augmentation model to produce a question answering dataset of (passage, question, answer) triplets.

2. An automatic question answering model for the coal mine safety field was studied. Specifically, the thesis focuses on a document-based question answering model that extracts accurate answers through machine reading comprehension. Because the number of regulatory documents is large, irrelevant documents can interfere with answer extraction, so relevant documents are selected first. A similarity matching process and a Rouge-L scoring mechanism are designed to select the regulatory documents that correspond to a question. CoalBERT is then used for vector feature encoding, with the question and passage concatenated as input, to build a precise question answering model based on span extraction. In addition, multi-task learning is incorporated to help the model recognize unanswerable questions.

3. By integrating the two studies above, an automatic question answering system for the coal mine safety field was developed. The system provides coal mine employees with a simple, easy-to-operate question answering page, and its main functional modules are user management and real-time question answering. The performance of the system was verified through question answering tests, and the results meet the expected standards.
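The document selection step in task 2 can be illustrated with a minimal sketch of Rouge-L scoring. It is only an illustration under stated assumptions, not the thesis's implementation: the function names (`lcs_length`, `rouge_l`, `select_passages`), the whitespace tokenization, the `beta` weighting, and the `top_k` cutoff are all hypothetical, and the separate similarity matching process is not shown. For Chinese regulatory text, character-level or word-segmented tokens would likely replace `str.split()`.

```python
from typing import List, Sequence, Tuple


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: Sequence[str], reference: Sequence[str], beta: float = 1.0) -> float:
    """Rouge-L F-score of `candidate` against `reference` (beta is an assumed weighting)."""
    if not candidate or not reference:
        return 0.0
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


def select_passages(question: str, passages: List[str], top_k: int = 1) -> List[Tuple[float, str]]:
    """Rank passages by Rouge-L against the question and keep the top_k.

    Whitespace tokenization is an assumption made for this sketch only.
    """
    q_tokens = question.lower().split()
    scored = [(rouge_l(p.lower().split(), q_tokens), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    docs = [
        "ventilation fans must run continuously",
        "the gas concentration limit in the mine is set by regulation",
    ]
    for score, passage in select_passages("gas concentration limit in mine", docs):
        print(f"{score:.3f}  {passage}")
```

In a pipeline of this shape, the top-ranked passages would be concatenated with the question and passed to the span extraction model, so that answer extraction runs over a few candidate documents rather than the full set of regulations.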