Extraction Of Spatiotemporal Attributes Information Of Gold Mines And Visual Expression Of Knowledge Graphs For Chinese Literature

Posted on: 2022-09-24 | Degree: Master | Type: Thesis
Country: China | Candidate: C Wang | Full Text: PDF
GTID: 2480306560463244 | Subject: Surveying and Mapping project
Abstract/Summary:
In the era of big data, data has become the most competitive asset, and unstructured geological text has become an important source of mineral big data. Geological journal literature in particular is updated quickly and in large volumes, presents advanced and novel knowledge, and is written in a standardized way, so it is dense in knowledge. Following the technical route of "corpus construction, information extraction, knowledge graph visualization, prototype system", this thesis studies the extraction of spatiotemporal attribute information of gold mines from Chinese literature and its visual expression as a knowledge graph. Deep learning models are used to extract and semantically analyze mineral information, and knowledge graph technology is used to visualize it, providing a data foundation and technical support for the deep mining and utilization of gold mine big data. The main research contents and innovations are as follows:

(1) Construction of a labeled corpus of gold mine information. Articles published from 2000 to 2020 in journals such as the Journal of Mineral Deposit Geology, Acta Petrologica Sinica and Contributions to Geology and Mineral Resources Research are collected, the descriptive characteristics of gold deposits are summarized, labeling standards for gold mine information are formulated, and a labeled corpus is built with self-developed interactive mineral information labeling software. The corpus provides standardized training and test data for gold mine information extraction.

(2) Extraction of gold deposit entity and attribute information based on a two-branch aggregation model. A two-branch aggregation model (BERT+BiLSTM+CNN+CRF) is designed around the way gold deposit entities and attributes are described in the literature. First, the BERT (Bidirectional Encoder Representations from Transformers) model is fine-tuned on small-scale labeled data of gold deposit entities and attributes. Then a bidirectional Long Short-Term Memory network (BiLSTM) and a Convolutional Neural Network (CNN) each extract features from the BERT output, and the features from the two branches are aggregated. Finally, a Conditional Random Field (CRF) uses the aggregated features to predict the labels of gold deposit entity and attribute information types. The model is compared with CRF, BERT+CRF (ori-BERT), BERT+CRF (wwm-BERT), BiLSTM+classifier, BiLSTM+CRF, BERT+BiLSTM+CRF (ori-BERT) and BERT+BiLSTM+CRF (wwm-BERT), which verifies the applicability and validity of deep learning models for extracting gold deposit entity and attribute information. The two-branch aggregation model performs best among the compared models, with F1 values of 94.27% for gold entity information, 94.87% for attribute information, 92.89% for spatial attribute information, and 90.78% for non-spatiotemporal attribute information.

(3) Relation extraction for gold deposit entities and spatiotemporal attributes. Three feature extractors, CNN, Attention+BiLSTM and Transformer, are studied for identifying relations between gold deposit entities and relations between gold deposit entities and their spatiotemporal attribute information. Compared with Attention+BiLSTM and Transformer, CNN better extracts relations between gold deposit entities, relations between gold deposit entities and non-spatiotemporal attributes, and relations between gold deposit entities and spatial attributes, with F1 values of 93.64%, 88.18% and 83.47%, respectively. The Attention+BiLSTM model performs best on relations between gold deposit entities and time attributes, with an F1 value of 89.84%.

(4) Visual expression of the gold deposit knowledge graph and prototype system. Based on an expression model of gold mine knowledge and the general triple representation <Node 1, Relation, Node 2>, the open-source ECharts library is used to build a knowledge graph structured as a semantic network, which visualizes the gold mine knowledge graph. A prototype system for spatiotemporal attribute information extraction and knowledge graph visualization of gold mines is developed, supporting gold literature data query, gold mine information extraction, evaluation of extraction results, and a gold mine knowledge base.
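As a rough illustration of how <Node 1, Relation, Node 2> triples might be turned into a graph structure that ECharts can render, the following Python sketch builds an option dictionary for an ECharts `graph` series. It is not the thesis's implementation: the function name `triples_to_echarts_option`, the sample deposit names, and the relation labels are all hypothetical, and only the `data`/`links` layout of the ECharts graph series is assumed.

```python
import json

# Hypothetical sample triples (Node 1, Relation, Node 2); the names are
# placeholders and are not taken from the thesis corpus.
SAMPLE_TRIPLES = [
    ("Deposit A", "located_in", "Region X"),
    ("Deposit A", "formed_in", "Early Cretaceous"),
    ("Deposit B", "located_in", "Region X"),
]


def triples_to_echarts_option(triples):
    """Build an ECharts option dict with one force-directed graph series.

    Each distinct head or tail becomes a node; each triple becomes a link
    whose edge label shows the relation name.
    """
    names = []
    for head, _relation, tail in triples:
        for name in (head, tail):
            if name not in names:  # keep first-seen order, skip duplicates
                names.append(name)

    nodes = [{"name": name} for name in names]
    links = [
        {
            "source": head,
            "target": tail,
            "label": {"show": True, "formatter": relation},
        }
        for head, relation, tail in triples
    ]
    return {
        "series": [
            {
                "type": "graph",
                "layout": "force",
                "roam": True,
                "label": {"show": True},
                "data": nodes,
                "links": links,
            }
        ]
    }


if __name__ == "__main__":
    option = triples_to_echarts_option(SAMPLE_TRIPLES)
    # The JSON can be passed to chart.setOption(...) on the front end.
    print(json.dumps(option, indent=2, ensure_ascii=False))
```

Keeping the triples as plain tuples means the same list could feed both the knowledge base and the visualization; the conversion step only deduplicates nodes and attaches the relation as an edge label.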
Keywords/Search Tags: gold mine literature, spatiotemporal attributes, gold ore entity, deep learning, aggregation model, attention mechanism, knowledge graph