
Research On The Interoperability Of Person Vocabulary Property Based On Wikidata Property

Posted on: 2021-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Cui
GTID: 2428330620463157
Subject: Information Science
Abstract/Summary:
A vocabulary is a set of authoritative terms carefully selected to describe entity concepts; that is, a collection of words and phrases that can effectively resolve ambiguities such as synonymy and polysemy. A vocabulary acts as a semantic hub and facilitates the integration and interlinking of different data sets. A person vocabulary is a collection of terms describing the characteristics of persons, and its creation provides industry-recognized terminology for identifying person entities. Different organizations describe people with different emphases, at different levels of granularity, and in different forms of expression, so person-domain vocabularies exhibit complex entity relations, diverse topic types, and broad concept terms. Inevitably, person descriptions from different fields overlap and cover the same concepts, which confuses users when they try to apply a concept from a person vocabulary. The creation of different vocabularies enriches the multi-faceted expression of person entity information, but it also increases the burden of information retrieval for users.

A large-scale semantic knowledge base brings together data on tens of thousands of related entities, and its classified, navigable organization of information can meet the personalized needs of users at different levels for various kinds of data. It is also the first choice of current users for obtaining or researching data, and its data is very heavily used. Therefore, implementing interoperation between the knowledge base and the vocabularies can effectively address the low reuse rate of vocabularies and the inconvenience of user retrieval, and can satisfy users' need for one-stop information retrieval. At the same time, the data of the knowledge base can be optimized and made more professional. In addition, applying the data classification scheme of the large knowledge base to the results of the interoperation mapping, and analyzing the data internally, can further improve users' utilization of the vocabularies.

Based on interoperability between the Wikidata knowledge base and Linked Open Vocabularies (LOV), this thesis combines theoretical and empirical research, taking data in the person domain as an example. It examines the vocabulary interoperation process from the following aspects:

(1) Interoperability-related theories. By analyzing interoperation-related theories and the basic interoperation process, the corresponding mapping types and methods are proposed. To support a standardized presentation of the interoperation mapping results, the relevant theory of the Resource Description Framework (RDF) is explained.

(2) Determination of the candidate person vocabularies. Following the interoperability process, the characteristics of the person-data vocabularies in Wikidata and LOV are analyzed in detail using related theories such as classes and properties. The final vocabularies are selected according to the screening principles of stability, coverage, and relevance, which lays the foundation for accurate vocabulary matching results in the later steps.

(3) Similarity matching between Wikidata and person vocabulary properties. Based on the candidate vocabularies, several person vocabulary interoperation models centered on Wikidata are defined. The candidate vocabularies and Wikidata's person data set are extracted separately and unified through data cleaning. Because the vocabularies are heterogeneous, several property descriptions are selected for matching, including property names, aliases, and superproperties, and a similarity algorithm is used for property alignment. The matching results are validated against the external vocabulary links recorded in Wikidata.

(4) According to the Wikidata property type, the concept terms of the properties are extracted from multiple perspectives, and property classification is completed to achieve resource integration and one-stop cross-searching of multi-source data. In addition, the RDF(S)/OWL languages are used to transform the property matching data into normalized data, and the Protégé tool is used to visualize it.
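The property alignment and RDF serialization described in aspects (3) and (4) can be illustrated with a small sketch. The abstract does not name the similarity algorithm, so this example assumes a plain character-based ratio from Python's `difflib`. The alias lists, the 0.8 threshold, the helper names (`normalize`, `best_match`, `align`, `to_turtle`), and the choice of `owl:equivalentProperty` for exact matches versus `skos:closeMatch` for near matches are all illustrative assumptions, not the thesis's actual method.

```python
import re
from difflib import SequenceMatcher

# Wikidata property IDs and labels are real; the alias lists are illustrative.
WIKIDATA_PROPS = {
    "P569": {"label": "date of birth", "aliases": ["birth date", "born on"]},
    "P19": {"label": "place of birth", "aliases": ["birthplace", "born in"]},
    "P734": {"label": "family name", "aliases": ["surname", "last name"]},
}

# A few person-related terms from external vocabularies (schema.org, FOAF).
VOCAB_TERMS = [
    "schema:birthDate",
    "schema:birthPlace",
    "foaf:familyName",
    "foaf:homepage",
]


def normalize(name):
    """Data cleaning step: split camelCase, unify separators, lowercase."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
    return re.sub(r"[_\-\s]+", " ", spaced).strip().lower()


def best_match(term, props, threshold=0.8):
    """Return (property_id, score) for the closest property, or None.

    Each property is scored by its best-matching description, taken
    over the label and all aliases (assumed scoring rule).
    """
    target = normalize(term)
    best = None
    for pid, info in props.items():
        candidates = [info["label"], *info["aliases"]]
        score = max(
            SequenceMatcher(None, target, normalize(c)).ratio()
            for c in candidates
        )
        if best is None or score > best[1]:
            best = (pid, score)
    return best if best and best[1] >= threshold else None


def align(terms, props, threshold=0.8):
    """Map each prefixed vocabulary term to its matched Wikidata property."""
    alignment = {}
    for term in terms:
        local_name = term.split(":", 1)[1]
        match = best_match(local_name, props, threshold)
        if match:
            alignment[term] = match
    return alignment


def to_turtle(alignment):
    """Serialize the alignment as Turtle mapping statements."""
    lines = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "@prefix wdt: <http://www.wikidata.org/prop/direct/> .",
        "@prefix schema: <http://schema.org/> .",
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .",
        "",
    ]
    for term, (pid, score) in sorted(alignment.items()):
        # Assumed convention: exact match -> equivalence, otherwise close match.
        predicate = "owl:equivalentProperty" if score == 1.0 else "skos:closeMatch"
        lines.append(f"{term} {predicate} wdt:{pid} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_turtle(align(VOCAB_TERMS, WIKIDATA_PROPS)))
```

Matching on aliases as well as labels is what lets `birthDate` align with "date of birth" even though the two strings differ in word order. A term with no counterpart above the threshold, such as `foaf:homepage` here, is simply left unmapped.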
Keywords/Search Tags:Wikidata, person vocabulary interoperation, property mapping, classification system, RDF data description