Process Named Entity Recognition Method Based on Deep Learning

Posted on: 2022-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: Q S Ji
GTID: 2492306572980699
Subject: Mechanical engineering
Abstract/Summary:
The Manufacturing Execution System (MES) is an important part of the intelligent workshop. Through the MES, process information is transferred from the design stage to the manufacturing stage. However, the process information fed into the MES takes the form of unstructured documents, which makes automated information identification difficult. Automatic identification of process information is therefore the key to transforming unstructured data in the MES into structured data. Named entity recognition based on deep learning shows great potential for structured information extraction, but the fine-grained, small-sample, and multi-modal characteristics of process documents in real industrial scenes hinder the modeling and deployment of deep learning named entity recognition methods. To this end, this thesis focuses on fine-grained tasks and small samples, and proposes a deep-learning-based named entity recognition method that is verified on an engineering case.

Firstly, for the fine-grained named entity recognition problem, a deep learning model based on the attention mechanism (BFB-attention) is proposed. The method starts from the semantic features output by the BERT pre-trained model and adds fine-grained features and entity boundary features. The fine-grained feature is an extra word-vector dimension designed from the prior label distribution, and the entity boundary feature is a smooth boundary word vector designed according to the distance from the entity boundary. The BERT semantic features, fine-grained features, and entity boundary features are fused through a character-level attention mechanism. The method is verified on CLUENER, a fine-grained named entity recognition dataset, where it achieves better recognition performance than other named entity recognition methods.

Secondly, for the small-sample named entity recognition problem, a "slice divide-and-conquer" data augmentation method is proposed on top of the BFB-attention model. The method keeps the label sequence of the augmented data aligned with its text. Non-entity sequences are augmented with the EDA data augmentation method, which performs unconditional text augmentation based on word-vector similarity; entity sequences are augmented with a dictionary-based method; the SimHash algorithm is then used to quickly remove similar sequences. On small-sample versions of the CLUENER and MSRA Chinese named entity recognition datasets, this method achieves better recognition performance than the original method.

Then, for a process quality information extraction case in an aircraft processing department, a complete workflow is implemented, from case analysis through data preprocessing to algorithm application and system development. Based on the characteristics of the multi-modal data, preprocessing methods such as data cleaning and modal normalization are designed, and the named entity recognition framework combining the BFB-attention model with "slice divide-and-conquer" data augmentation is applied to the recognition of process quality information. Real process data are used to verify the recognition performance of the method. Experimental results show that the model obtains an F1 score of 0.937 on the process specification data and extracts process quality information more accurately than other methods. A process quality information extraction system is also developed, which encapsulates the document preprocessing and the recognition model and thereby realizes the structured extraction of quality information from process documents.

Finally, the main work of the thesis is summarized and directions for further research are outlined.
Keywords/Search Tags:process documents, deep learning, named entity recognition, fine-grained, small samples, unstructured data
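
The "slice divide-and-conquer" augmentation described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation; it is written under several assumptions: a sample is represented as a list of (text, label) slices, the label names and the entity dictionary are hypothetical, random character deletion (one of the standard EDA operations) stands in for the word-vector-similarity replacement mentioned in the abstract, and SimHash is computed from MD5 hashes of character bigrams with a hypothetical Hamming-distance threshold. Because each slice is augmented separately and the tags are regenerated from the slices, text and label sequence stay aligned.

```python
import hashlib
import random
from typing import Dict, List, Tuple

# A sample is a list of (text, label) slices; "O" marks non-entity text.
# This representation is an assumption made for the sketch.
Slice = Tuple[str, str]
BITS = 64  # SimHash fingerprint width


def to_bio(slices: List[Slice]) -> Tuple[List[str], List[str]]:
    """Expand slices into character-level tokens and aligned BIO tags."""
    chars, tags = [], []
    for text, label in slices:
        for i, ch in enumerate(text):
            chars.append(ch)
            if label == "O":
                tags.append("O")
            else:
                tags.append(("B-" if i == 0 else "I-") + label)
    return chars, tags


def augment(slices: List[Slice], entity_dict: Dict[str, List[str]],
            rng: random.Random, p_delete: float = 0.1) -> List[Slice]:
    """Augment each slice on its own so labels never drift out of alignment."""
    out = []
    for text, label in slices:
        if label == "O":
            # Stand-in for EDA: randomly delete characters of non-entity text.
            kept = "".join(ch for ch in text if rng.random() > p_delete)
            out.append((kept or text, "O"))
        else:
            # Dictionary augmentation: swap in another entity of the same type.
            candidates = entity_dict.get(label) or [text]
            out.append((rng.choice(candidates), label))
    return out


def simhash(text: str, n: int = 2) -> int:
    """64-bit SimHash over character n-grams."""
    grams = [text[i:i + n] for i in range(max(1, len(text) - n + 1))]
    weights = [0] * BITS
    for gram in grams:
        digest = hashlib.md5(gram.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for b in range(BITS):
            weights[b] += 1 if (h >> b) & 1 else -1
    return sum(1 << b for b in range(BITS) if weights[b] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def dedupe(samples: List[List[Slice]], max_distance: int = 3) -> List[List[Slice]]:
    """Keep a sample only if its fingerprint is far from all kept ones."""
    kept, hashes = [], []
    for sample in samples:
        h = simhash("".join(text for text, _ in sample))
        if all(hamming(h, other) > max_distance for other in hashes):
            kept.append(sample)
            hashes.append(h)
    return kept


if __name__ == "__main__":
    rng = random.Random(0)
    source = [("mill the ", "O"), ("top face", "FEATURE"),
              (" to ", "O"), ("Ra1.6", "ROUGHNESS")]
    dictionary = {"FEATURE": ["side wall", "slot"], "ROUGHNESS": ["Ra3.2"]}
    pool = dedupe([source] + [augment(source, dictionary, rng) for _ in range(5)])
    for sample in pool:
        chars, tags = to_bio(sample)
        print("".join(chars), tags)
```

The pairwise comparison in `dedupe` is quadratic in the number of samples, which is acceptable for small-sample settings; larger pools would typically bucket fingerprints before comparing them.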