Research And Implementation Of Multimodal Named Entity Recognition Based On Deep Learning

Posted on: 2023-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: W W Cen
Full Text: PDF
GTID: 2558306914463724
Subject: Computer Science and Technology
Abstract/Summary:
Named entity recognition (NER) is an important part of information extraction. The quality of an NER model directly affects downstream tasks such as relation extraction, and NER plays an indispensable role in modern applications such as intelligent dialogue. In the era of big data, the diversification of data forms opens up more possibilities for this task. NER that integrates multimodal data such as images obtains multi-dimensional information and recognizes named entities in text more accurately, which gives it important research value. At present, multimodal NER is mainly based on deep learning. How to fully mine the information in each modality, and how to process the information from multiple modalities together more effectively, are the main research questions and difficulties in this task. Specifically, current NER based on sequence labeling cannot effectively use the semantic information of entity categories, image features are represented in only a single way, and the lack of interaction between modalities prevents their information from being effectively integrated. To address these problems, the thesis studies a text feature extraction algorithm based on a machine reading comprehension (MRC) question answering model, an image feature extraction algorithm based on object detection, and a multimodal feature fusion method based on cross-modal attention. The specific content and results are as follows:

(1) A text feature extraction algorithm based on machine reading comprehension question answering is proposed. Compared with existing NER based on sequence labeling, this algorithm can integrate the prior semantic information of entity categories to obtain a richer contextual representation of the text. The effectiveness of the method is demonstrated by the significant improvement in NER performance in the experiments. In addition, a method of fusing the object detection information of images is proposed to make more effective use of key image information.

(2) A multimodal feature fusion method based on cross-modal attention is proposed. It is based on the Transformer structure and fuses text and image features so that the model focuses on the more useful parts of each modality. Compared with the simple fusion used in previous multimodal tasks, the modalities in this method interact sufficiently and are strongly connected, which makes more effective use of multimodal information.

(3) A prototype system for multimodal NER based on deep learning is implemented. The front-end pages and back-end system are Web-based and provide users with real-time input and visualization of results. The system also integrates functions such as web crawling, entity linking, entity relation extraction, and relation graph display.
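The cross-modal attention idea in (2) can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes text tokens act as queries and image regions (for example, object-detection features) act as keys and values, omits the learned projections and multiple heads of a real Transformer layer, and uses the hypothetical names `cross_modal_attention`, `text_feats`, and `image_feats` with toy vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(text_feats, image_feats):
    """Scaled dot-product attention from text tokens to image regions.

    Assumption: both modalities already share the same feature size d.
    A real model would first apply learned query/key/value projections.
    """
    d = len(image_feats[0])
    scale = math.sqrt(d)
    fused, weights = [], []
    for q in text_feats:
        # Similarity of this text token to every image region.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale
                  for k in image_feats]
        w = softmax(scores)
        # Attention-weighted sum of image region features.
        context = [sum(w[j] * image_feats[j][i]
                       for j in range(len(image_feats)))
                   for i in range(d)]
        # Residual connection: keep the text feature, add visual context.
        fused.append([qi + ci for qi, ci in zip(q, context)])
        weights.append(w)
    return fused, weights


# Toy input: 2 text tokens and 3 image regions, feature size 2.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[4.0, 0.0], [0.0, 4.0], [1.0, 1.0]]
fused, weights = cross_modal_attention(text_feats, image_feats)
```

Each row of `weights` is a distribution over image regions for one text token, so a token places most of its weight on the region whose features align with it. That is the sense in which attention-based fusion lets the model focus on the more useful parts of the other modality, in contrast to simple concatenation.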
Keywords/Search Tags: multimodal, named entity recognition, machine reading comprehension, cross-modal attention