Research On Named Entity Recognition And Relation Extraction For Chemical Industry Safety

Posted on: 2023-08-08 | Degree: Master | Type: Thesis
Country: China | Candidate: L F Peng
GTID: 2531306794990539 | Subject: Control engineering
Abstract/Summary:
Hazard and operability analysis (HAZOP) is a commonly used hazard analysis method in the chemical industry. Traditional HAZOP analysis, however, has two weaknesses. First, it is usually conducted as a brainstorming session by a panel of experts, which keeps the analysis focused but makes the process costly and overly dependent on expert experience. Second, HAZOP reports contain a huge amount of unstructured data stored as paper reports or electronic documents, which makes the information difficult to reuse and share and leaves these resources underutilized. There is therefore an urgent need to use natural language processing (NLP) technology to standardize this information and move HAZOP analysis in a more automated and intelligent direction.

Information extraction in the chemical industry faces the following difficulties: (1) information extraction in the petrochemical field has developed slowly because of the lack of available corpora; (2) HAZOP data contains a large number of polysemous words, proper nouns, and complex nested entities and relations, which makes knowledge extraction difficult. To address these problems, this thesis constructs a HAZOP domain corpus and an ELMo-DCNN-BiLSTM-CRF model for named entity recognition (NER). On this basis, it constructs an entity-relation joint extraction model based on parameter sharing to extract entity-relation triples.

The main work is as follows:
(1) The data characteristics of the chemical industry are analyzed, the elements and levels of the data set are divided, and 6 entity categories and 13 relation categories are defined. On this basis, the data is processed and a fine-grained, small-scale domain corpus is constructed by manual annotation.
(2) To address polysemy and long-distance text recognition in HAZOP texts, the ELMo pre-trained language model and a double-layer convolutional neural network (DCNN) are used to extract richer dynamic features. The ELMo-DCNN-BiLSTM-CRF deep learning model is constructed for NER, and a suitable activation function is selected for model training. In experiments, the model reaches an F1 value of 91.64%.
(3) To improve the recognition accuracy of complex nested entity relations in HAZOP text, a relation extraction model integrating a multi-head attention mechanism is constructed. Parameter sharing is then used to build the entity-relation joint extraction model: the NER and relation extraction models share the features of the ELMo model, which strengthens the information interaction between them. Experiments on the self-labeled corpus show that the joint extraction model constructed in this thesis recognizes HAZOP entity-relation triples better than the other models it was compared with.
Keywords/Search Tags: hazard and operability analysis, named entity recognition, entity relation joint extraction, parameter sharing
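The abstract reports NER quality as an F1 value. As a reading aid, the sketch below shows one common way such a score is computed for sequence labeling: decode BIO tags into entity spans, then compare predicted spans with gold spans by exact match. This is an illustration under assumptions, not the thesis's evaluation code. The abstract does not say whether its F1 is entity-level or token-level, and the label names `EQUIPMENT` and `DEVIATION` are hypothetical, since the 6 entity categories are not listed here.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Decode a BIO tag sequence into a set of entity spans."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel flushes a span that ends at the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue  # still inside the current span
        if label is not None:
            spans.add((start, i, label))  # close the open span
        start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]  # open a new span
        # A stray "I-" tag with no matching open span is ignored.
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: the thesis's actual entity categories are not given.
gold_tags = ["B-EQUIPMENT", "I-EQUIPMENT", "O", "B-DEVIATION", "O"]
pred_tags = ["B-EQUIPMENT", "I-EQUIPMENT", "O", "O", "O"]
precision, recall, f1 = entity_f1(gold_tags, pred_tags)
```

In this toy case the prediction finds one of the two gold entities and nothing spurious, so precision is 1.0, recall is 0.5, and F1 is their harmonic mean, 2/3.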