| Geological reports consist mainly of text, drawings and diagrams, which contain rich geological knowledge and expert experience. Extracting geological entities and inter-entity relationships from these reports efficiently and accurately is an active research topic in geological big data. Entity and relationship extraction is also the foundation for building a knowledge graph, which can support downstream tasks such as intelligent services and accurate reasoning over geological knowledge. This paper studies geological entity and relationship extraction along the line of "dataset construction, deep learning extraction model, prototype system validation". The main work is as follows: (1) A relationship extraction dataset based on the text of geological reports was constructed. Under the guidance of geology experts, 24 relationships were defined according to the characteristics of relationships between geological entities in regional geological reports; annotation tools were developed, annotation rules were formulated, and the defined relationships were annotated using a semi-automatic approach. The reliability of the dataset was verified by a consistency check. (2) A joint geological entity and relationship extraction model that accounts for contextual information and relationship overlap is proposed. The model uses RoBERTa to represent geological entity-relationship triples, and uses BiGRU and axial attention networks to establish context dependencies and incorporate global information. Experiments show that the model performs well on overlapping entity relationships, data imbalance, and the difficulty of capturing contextual features in relationship extraction. (3) A model for geohazard named entity recognition and relationship extraction based on the pre-trained BERT model is proposed. The types of geohazard entities and 
relations are first defined, and a geohazard corpus is constructed under the guidance of geohazard knowledge. The model uses BERT to represent named entities and relations, and introduces BiGRU and attention mechanisms to address polysemous words in geohazard texts and poor feature fusion. The proposed model improves significantly over the baseline model. Finally, a prototype system for geological entity and relationship extraction is designed, whose functions mainly include data annotation, model training and visualisation modules. The system is used to test the proposed modelling methods and to verify their practicality. |
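
To make the notion of relationship overlap mentioned in (2) concrete, the following minimal Python sketch labels each triple in a sentence by the kind of overlap it participates in. It is an illustration only, not the paper's method: the function name `overlap_types`, the labels `EPO` (entity pair overlap), `SEO` (single entity overlap) and `Normal`, and the example entities and relation names are all assumptions introduced here and are not taken from the paper's 24 defined relationships.

```python
from collections import Counter


def overlap_types(triples):
    """Label each (head, relation, tail) triple as 'EPO', 'SEO' or 'Normal'.

    Assumes the triples of one sentence are distinct (no exact duplicates).
    """
    # How many triples connect each unordered entity pair.
    pair_counts = Counter(frozenset((h, t)) for h, _, t in triples)
    # How many triples each entity takes part in.
    entity_counts = Counter()
    for h, _, t in triples:
        entity_counts.update({h, t})

    labels = []
    for h, _, t in triples:
        if pair_counts[frozenset((h, t))] > 1:
            # The same two entities are linked by more than one relation.
            labels.append("EPO")
        elif entity_counts[h] > 1 or entity_counts[t] > 1:
            # Exactly one entity is shared with some other triple.
            labels.append("SEO")
        else:
            labels.append("Normal")
    return labels


# Hypothetical triples, for illustration only.
triples = [
    ("granite", "intrudes", "sandstone"),
    ("granite", "is_younger_than", "sandstone"),
    ("granite", "contains", "biotite"),
    ("fault F1", "cuts", "limestone"),
]
print(overlap_types(triples))
```

A joint extraction model has to recover all four triples from the same sentence, which is why approaches that assign at most one relation per entity pair, or one role per entity, tend to struggle with the first three.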