Design And Application Of Multi-Modal-Oriented Event Analysis Algorithm

Posted on: 2023-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: S B Chen
GTID: 2558307073491324
Subject: Computer technology
Abstract/Summary:
Event analysis on social media networks is one of the basic tasks of social platforms. The identified event categories and related information are generally applied to user portrait construction, personalized recommendation, platform information classification, knowledge graph construction, and similar uses, in order to promote platform development and improve user experience. Because social platforms are flooded with user information of all kinds and social events occur at random, manual detection and traditional rule-based methods cannot locate and analyze events quickly. Deep learning algorithms are therefore usually used to analyze user-posted information and widely discussed events quickly.

This thesis studies the key technologies of event classification for multimodal data in social scenarios. Using real social media data, it treats event analysis as a classification task and constructs a multimodal event classification model. Mature deep learning optimization schemes are then applied to improve the performance of the model. The resulting system can quickly analyze multimodal data in social networks and helps to track and analyze the public opinion information published on the platform.

First, the processing flow of the multimodal event analysis task is analyzed. Following that flow, the task is decomposed into single-modal data analysis and preprocessing, the design of the multimodal joint representation network and the classification network, and the model's loss function. The model is then optimized in terms of stickiness, generalization, and so on.

The dataset in this thesis consists of image-text data collected from social networks, which contains redundant information and meaningless symbols. To improve data quality and unify the data format, a preprocessing pipeline is designed for each modality. Image data is preprocessed with an object detection model based on the Faster R-CNN network, which breaks each picture down into a sequence of representations of the detected objects that carry semantic information, together with their coordinates. Text preprocessing uses a cleaning process built on a text perplexity model to improve text quality, followed by dictionary mapping.

Baseline models for the multimodal task are selected and designed following the single-stream and double-stream multimodal representation approaches. The precision and recall of the two baseline models are measured on the thesis dataset, and the single-stream structure, with an F1-Score of 64%, is selected as the baseline model for the tuning phase.

To improve the baseline, three optimization schemes are formulated for the dataset and model, drawing on optimization techniques from computer data tasks, natural language processing, and classification tasks. First, Focal Loss replaces the original loss function to address the imbalanced distribution of samples, so that gradient descent during training is faster and more stable. Second, based on adversarial learning, perturbations are added to the text and image embeddings, which improves accuracy on the validation set by 2%. Third, the idea of contrastive learning from natural language processing is applied to strengthen the representation ability of the multimodal model, which increases accuracy by 2%. Combining all the optimizations, the F1-Score of the optimized model is 11 percentage points higher than that of the baseline, reaching 75%, which verifies the benefit of the optimization strategy.
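
As a rough illustration of the Focal Loss idea mentioned in the abstract, the sketch below computes a multi-class focal loss for a single sample in plain Python. The function names, the default `gamma=2.0`, and the optional per-class weights `alpha` are assumptions made for illustration; the abstract does not state the hyperparameters or implementation used in the thesis.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are illustrative hyperparameters (assumed, not from the thesis).
    With gamma=0 and alpha=None this reduces to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)

# Compare an easy (confidently correct) sample with a hard one, both of true class 0.
easy = [4.0, 0.0, 0.0]   # high score on the true class
hard = [0.0, 1.0, 0.0]   # low score on the true class
for name, logits in (("easy", easy), ("hard", hard)):
    ce = focal_loss(logits, 0, gamma=0.0)   # plain cross-entropy
    fl = focal_loss(logits, 0, gamma=2.0)   # focal loss
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f}")
```

The factor `(1 - p_t)**gamma` shrinks the loss of well-classified samples much more than that of misclassified ones, so frequent, easy classes contribute less to the gradient. This is the usual motivation for using Focal Loss on imbalanced data.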
Keywords/Search Tags: Multimodality Representation, Deep Learning, Classification Neural Network, Adversarial Learning, Contrastive Learning