
Research On Knowledge Extraction Method For Wikipedia Multimodal Data

Posted on: 2020-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: C J Xie
Full Text: PDF
GTID: 2428330572457130
Subject: Computer Science and Technology
Abstract/Summary:
In constructing knowledge graphs for the metal materials field, attention is usually paid to mining text or structured data, while multimedia data are often ignored. As a result, such knowledge graphs are of limited use for certain question-answering tasks, such as visual Q&A. This thesis designs a method for extracting metal-materials knowledge from Wikipedia multimodal data and constructing a multimodal knowledge graph. Inspired by IMGpedia, image data are explored further on the basis of text mining. Unlike IMGpedia, an image feature classification method based on a deep neural network model is introduced to extract vision tags for the images. In addition, using the textual description as the context of an image, rich textual entity tags associated with the image are obtained with an entity annotation system. Based on the hierarchical relationships in WordNet and DBpedia, a topology-based data fusion method is designed, and the knowledge extracted from image vision and textual description is fused to construct a lightweight multimodal knowledge graph. The main research work is as follows:

1) A vision tag generation strategy based on a deep neural network model is designed. Visual features are mined and classified with VGG-Net and used as vision tags bound to the image, rather than as a simple visual descriptor. In addition, a new evaluation criterion is designed for the results of visual content processing. Unlike the criteria used in image classification tasks, it measures the reasonableness of the generated tags with respect to the visual content of the image; this thesis defines this measure as satisfaction.

2) Based on the descriptive text of an image and DBpedia, entities representing the textual semantics of the image are obtained. The descriptive text is used as context that extends the semantic content associated with the image and is mined with DBpedia-Spotlight; the extracted text entities are regarded as resources associated with the image. Furthermore, a new strategy is designed for setting the DBpedia-Spotlight parameters so that the system better fits the data in this work.

3) A topology-based strategy and method for multimodal data integration are designed to construct a lightweight multimodal knowledge graph. Based on the hierarchical structures in WordNet and DBpedia, an extension strategy and a concept screening strategy based on effective concepts are designed, and a set of relationship properties connecting images to text and text to text is defined. Finally, the relevant processes and results are demonstrated with a prototype system.
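The vision tag generation and satisfaction evaluation described in 1) can be sketched roughly as follows. This is a hedged illustration only: it assumes class probabilities from a VGG-style classifier are already available, and the `vision_tags` thresholding and the `satisfaction` formula are hypothetical simplifications, not the thesis's exact definitions.

```python
def vision_tags(class_probs, labels, threshold=0.2, top_k=3):
    """Turn classifier output into vision tags: keep the top-k labels
    whose probability exceeds the threshold."""
    ranked = sorted(zip(labels, class_probs), key=lambda p: p[1], reverse=True)
    return [label for label, prob in ranked[:top_k] if prob >= threshold]

def satisfaction(tag_judgements):
    """Hypothetical 'satisfaction' score: the fraction of generated tags
    judged reasonable with respect to the image's visual content."""
    if not tag_judgements:
        return 0.0
    return sum(1 for ok in tag_judgements if ok) / len(tag_judgements)

# Example: probabilities a VGG-Net softmax might produce for one image
labels = ["steel plate", "gear", "microstructure", "cat"]
probs = [0.55, 0.30, 0.10, 0.05]
tags = vision_tags(probs, labels)
print(tags)  # ['steel plate', 'gear']
```

Unlike top-1 classification accuracy, a satisfaction-style score can credit several reasonable tags per image, which matches the abstract's point that tags are judged by visual reasonableness rather than by a single gold label.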
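For step 2), DBpedia-Spotlight's `/rest/annotate` endpoint returns JSON in which annotated entities appear under a `Resources` array with `@URI`, `@surfaceForm`, and `@similarityScore` fields; the request itself can be tuned with `confidence` and `support` parameters, which is the kind of parameter setting the abstract mentions. The sketch below only parses such a response; the sample text and the `min_similarity` cutoff are illustrative assumptions, not the thesis's actual settings.

```python
import json

def parse_spotlight(response_json, min_similarity=0.5):
    """Extract (surface form, DBpedia URI) pairs from a DBpedia-Spotlight
    /rest/annotate JSON response, keeping only confident annotations."""
    entities = []
    for res in response_json.get("Resources", []):
        if float(res.get("@similarityScore", 0.0)) >= min_similarity:
            entities.append((res["@surfaceForm"], res["@URI"]))
    return entities

# A truncated example of the JSON shape Spotlight returns
sample = json.loads("""{
  "@text": "Austenite is a solid solution of carbon in iron.",
  "Resources": [
    {"@URI": "http://dbpedia.org/resource/Austenite",
     "@surfaceForm": "Austenite", "@similarityScore": "0.99"},
    {"@URI": "http://dbpedia.org/resource/Iron",
     "@surfaceForm": "iron", "@similarityScore": "0.95"}
  ]
}""")
print(parse_spotlight(sample))
```

Each extracted URI can then be treated as a resource associated with the image, as the abstract describes.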
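The fusion step in 3) can be pictured as emitting triples that connect an image to its vision tags and text entities, then extending each concept upward through the WordNet/DBpedia hierarchy. A minimal sketch, in which the property names (`ex:hasVisionTag`, `ex:associatedEntity`, `ex:broaderConcept`) and the toy hierarchy are hypothetical, not the thesis's actual schema:

```python
def fuse(image_uri, vision_tags, text_entities, hierarchy):
    """Fuse vision tags and text entities into triples for a lightweight
    multimodal knowledge graph. Property names are illustrative only."""
    triples = []
    for tag in vision_tags:
        triples.append((image_uri, "ex:hasVisionTag", tag))
    for entity in text_entities:
        triples.append((image_uri, "ex:associatedEntity", entity))
    # Topology-based extension: link each concept to its parent in the
    # WordNet/DBpedia hierarchy so images connect to broader concepts.
    for concept in vision_tags + text_entities:
        parent = hierarchy.get(concept)
        if parent:
            triples.append((concept, "ex:broaderConcept", parent))
    return triples

hierarchy = {"dbr:Austenite": "dbr:Phase_(matter)", "steel": "alloy"}
triples = fuse("img:001", ["steel"], ["dbr:Austenite"], hierarchy)
print(len(triples))  # 4
```

A concept screening pass, as the abstract describes, would additionally drop concepts that fall outside the set of effective concepts before emitting the hierarchy triples.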
Keywords/Search Tags:Knowledge graph, IMGpedia, WordNet, DBpedia, VGG-Net, Metallic Materials, DBpedia-Spotlight