| With the development of technology, the Internet plays an increasingly important role as a channel for obtaining information, and the amount of data on the Internet has increased sharply, including food safety-related data in various forms such as text, pictures, audio, and video. These unstructured data contain a large amount of knowledge in the field of food safety. Traditional manual processing methods are time-consuming and labor-intensive, making it difficult to mine large-scale multimodal data efficiently. Automatically extracting structured knowledge from unstructured multimodal data, realizing key information extraction based on multimodal data fusion, and constructing a knowledge graph for the food safety field are therefore of great significance to the development of digital intelligence in this field. This paper investigates the collection, pre-processing, and extraction of key information from multimodal data such as text, video, and audio in the field of food safety. Here, key information extraction refers to extracting food names, pathogen strains, pathogenic factors, symptoms, and other entities with specific meaning from food safety text, video, and audio, and constructing structured triples of the relationships between entities. The specific research in this paper is as follows: (1) Web crawler technology is used to obtain multimodal data such as text and video in the food safety field; the multimodal data are pre-processed, and a text corpus is built from them. To address the problems that speech data in the field of food safety are difficult to obtain and that the performance of the trained acoustic models is unsatisfactory, this paper adopts a transfer learning approach: the acoustic model is first trained on open-source speech datasets, and the optimal model parameters are stored and transferred to the food safety speech dataset for
training. The experimental results show that the DFCNN-CTC model incorporating transfer learning performs well on speech recognition in the field of food safety. The DFCNN-CTC acoustic model is combined with an N-gram language model to form a complete speech recognition pipeline. (2) A joint neural network model incorporating adversarial training is proposed for the joint extraction of entities and relations in the food safety domain, which is characterized by a large number of entities and strong specialization. A BERT pre-trained model incorporating adversarial training is used to obtain word vector representations, and a Bi-directional Long Short-Term Memory (BiLSTM) network is used to obtain temporal information from food safety text sequences. The model also extracts multiple spatial features and captures key character information and inter-character dependencies in the text sequence. The experimental results show that the model achieves high accuracy on the joint entity-relation extraction task for food safety information. (3) A knowledge graph-based food safety question answering system is designed and developed. The extracted <head entity, relationship, tail entity> triples are stored in the graph database Neo4j, so that structured food safety data can be visualized through the constructed knowledge graph, and a food safety question answering system based on the Flask web development framework is built on top of the knowledge graph. |
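
To make the <head entity, relationship, tail entity> representation concrete, the following is a minimal sketch of how such triples might be stored and looked up to answer a simple question. It is not the system described above: a small in-memory index stands in for Neo4j, and the class name `TripleStore`, the function `answer`, the relation names, and the sample triples are all hypothetical illustrations rather than data or identifiers from this work.

```python
from collections import defaultdict


class TripleStore:
    """In-memory stand-in (an assumption, not Neo4j) for (head, relation, tail) triples."""

    def __init__(self):
        # Maps a (head, relation) pair to the list of tail entities seen for it.
        self._index = defaultdict(list)

    def add(self, head, relation, tail):
        """Store one triple."""
        self._index[(head, relation)].append(tail)

    def query(self, head, relation):
        """Return all tail entities for the given head and relation, in insertion order."""
        return list(self._index.get((head, relation), []))


def answer(store, head, relation):
    """Turn a (head, relation) lookup into a short text answer."""
    tails = store.query(head, relation)
    if not tails:
        return "No answer found."
    return ", ".join(tails)


# Hypothetical sample triples, for illustration only.
store = TripleStore()
store.add("Salmonella", "causes_symptom", "diarrhea")
store.add("Salmonella", "causes_symptom", "fever")
store.add("Salmonella", "found_in", "raw eggs")

print(answer(store, "Salmonella", "causes_symptom"))
```

In a question answering setting, a question such as "What symptoms does Salmonella cause?" would first be mapped to the pair (`Salmonella`, `causes_symptom`), and the lookup above would then supply the answer entities; a graph database serves the same role at scale and adds visualization.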