| The rise of modern industry is inseparable from its deep integration with Internet+, and the large, complex data sets now common in industry have created information barriers. The digitalization of modern industry urgently needs a unified, standardized data set as the basis for industry interoperability. In this context, more accurate data extraction is the goal of our work on industrial data sets. Identifying and extracting the key information in text to build a knowledge graph is a prerequisite for producing visualized, accurate, and simplified data. This thesis studies information extraction and knowledge graph construction for the industrial domain. The specific research work is as follows. (1) An industrial named entity recognition method combining entity enhancement and multi-head attention. An improved entity enhancement method based on a dynamic text distance protection mechanism and EDA (Easy Data Augmentation) is proposed. By feeding the semantics of both the entity-enhanced text and the source text into the embedding layer, several different word embedding feature vectors are fused, which adds noise to the corpus and helps the network generalize. In addition, attention training on the feature vectors is proposed: by training weights on the dilated convolution output, a weighted combination of the convolution output and the weight matrix is obtained, which emphasizes key features. Finally, experimental analysis shows that the entity enhancement method based on the dynamic text distance protection mechanism, together with entity recognition that integrates entity enhancement and multi-head attention, improves performance on a private industrial dataset and effectively addresses the problem of industrial entity data sparsity. (2) Entity relation extraction based on multi-channel feature fusion. A multi-channel relation extraction model is proposed. By using several pre-trained word vectors as external input and combining the advantages of
recurrent neural networks, convolutional neural networks, and graph convolutional neural networks, the model obtains the relations between the labeled entities in a sentence. The experimental results show that the fusion of three feature vectors with different characteristics is effective for relation extraction in the industrial domain. In addition, front-end attention training on the data is added to each channel to obtain weighted word vectors, so that the model is more expressive at the early vector representation layer. (3) An industrial knowledge graph visualization system based on Neo4j. This thesis proposes an industrial knowledge graph system that produces information triples through text entity recognition and relation extraction, from which the prototype database of the system is built. The system is aimed at enterprise users. Each user can log in to view and modify the triple information as their situation requires, enabling standardized data sharing. By bringing enterprise users into the system, entity and relation information can be labeled uniformly across the industry, reducing cluttered and siloed information. |
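
To make the third contribution concrete, the following is a minimal sketch of how extracted (head, relation, tail) triples might be turned into parameterized Cypher statements for Neo4j. It is not the thesis's implementation: the `Triple` type, the `Entity` node label, the `name` property, and the sample triple are all assumptions introduced for illustration.

```python
import re
from typing import Dict, NamedTuple, Tuple


class Triple(NamedTuple):
    """One extracted fact: head entity, relation label, tail entity."""
    head: str
    relation: str
    tail: str


def relation_type(label: str) -> str:
    """Normalize a free-text relation label into a Cypher relationship type.

    Cypher cannot take a relationship type as a query parameter, so the label
    is reduced to word characters and underscores before being placed in the
    query text.
    """
    cleaned = re.sub(r"\W+", "_", label.strip()).strip("_").upper()
    if not cleaned:
        raise ValueError("relation label has no usable characters")
    return cleaned


def to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Build a MERGE statement and its parameters for one triple.

    MERGE creates the nodes and the relationship only if they are missing,
    so loading the same triple twice does not duplicate data.
    The `Entity` label and `name` property are assumed, not from the thesis.
    """
    rel = relation_type(triple.relation)
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:`{rel}`]->(t)"  # backticks keep unusual names valid
    )
    return query, {"head": triple.head, "tail": triple.tail}


if __name__ == "__main__":
    # Hypothetical sample triple, not taken from the thesis dataset.
    query, params = to_cypher(Triple("CNC lathe", "has component", "spindle"))
    print(query)
    print(params)
```

Each returned pair could then be passed to a Neo4j session, for example with the official Python driver's `session.run(query, params)`. Entity names travel as parameters rather than being spliced into the query string, which avoids quoting problems when users edit triples through the system.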