Research On Classification And Navigation Of LOD Vocabulary Oriented To Dataset Application

Posted on: 2022-09-01
Degree: Master
Type: Thesis
Country: China
Candidate: W W Ma
Full Text: PDF
GTID: 2518306509468014
Subject: Information Science
Abstract/Summary:
A vocabulary provides a set of authoritative terms with which users describe entities and concepts. It also adds explicit semantic relationships to the data, which resolves the ambiguity caused by polysemous words and synonyms. Linked datasets are datasets published in a linked data format. Each dataset is described by one or more vocabularies, and these shared vocabularies create the interconnections between datasets. The vocabulary is therefore the core of semantic relations and the basis of linked data.

Because vocabularies are built with different methods and for different target users, they differ in the languages they involve and the domains they cover, and they are used differently when describing and searching web resources. Which vocabulary users should choose to describe dataset resources has also caused a lot of confusion. Vocabularies differ in theme, form of expression, and focus, so users cannot accurately use them to describe datasets in a fine-grained manner, which increases their search burden. In view of the poor interoperability between current vocabularies and datasets and the narrow range of vocabulary application, this thesis classifies vocabularies by application scenario. The finer classification lets users quickly and accurately choose vocabularies to describe and interoperate with dataset resources, which improves retrieval efficiency and vocabulary utilization and makes the application of vocabularies more targeted.

Based on how vocabularies are applied to datasets in the LOD (Linked Open Data) cloud, this thesis discusses in depth the theory and method of a classification system for vocabularies. It focuses on the following aspects:
(1) Related theories of classification system construction and clustering algorithms. By analyzing classification system theory and the types of clustering algorithms, suitable algorithms are selected to classify the vocabularies along different dimensions.
(2) Feature selection for vocabulary clustering. A detailed analysis of the themes of the datasets, the definitions of the vocabularies, and the subject characteristics of the vocabularies in the LOD cloud lays the foundation for accurate vocabulary classification results.
(3) Empirical research on vocabulary clustering. After unified preprocessing such as data cleaning, the themes of the datasets are converted into document vectors, and the document vectors are clustered with the k-means++ algorithm, which yields 15 dataset topics. Similarity is then calculated between the clustered dataset topics and the topics of the vocabularies, which yields the classification of the vocabularies. The similarity between different vocabulary classes is also calculated, and the classes are arranged into a hierarchy according to the semantic relationships between them. In the end, 34 classes are obtained, most of which have 2-3 levels, and the connections between the classes are not close enough.
(4) Design of a classification and navigation prototype based on the vocabulary classification results. The prototype has two functions: search and browsing by category. It presents the classification of the vocabularies along different dimensions and makes it convenient for users to search and browse.
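
The clustering and matching steps in (3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not say how document vectors are built or which similarity measure is used, so TF-IDF vectors and cosine similarity are assumed here, and the function names (fit_idf, to_vector, kmeans_pp_init, kmeans, assign_vocabulary) and the toy topics are hypothetical.

```python
import math
import random
from collections import Counter


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the tokenized docs."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def to_vector(tokens, idf):
    """L2-normalized TF-IDF vector over the sorted term list; unknown terms are ignored."""
    tf = Counter(tokens)
    v = [tf[t] * idf[t] for t in sorted(idf)]
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0


def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:  # every point already coincides with a center
            centers.append(list(rng.choice(points)))
        else:
            centers.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centers


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd iterations started from k-means++ seeds; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def assign_vocabulary(vocab_vec, centers):
    """Index of the most cosine-similar cluster center, plus that similarity."""
    sims = [cosine(vocab_vec, c) for c in centers]
    best = max(range(len(centers)), key=sims.__getitem__)
    return best, sims[best]


# Toy, made-up dataset topics (already tokenized); not data from the thesis.
dataset_topics = [
    ["gene", "protein", "biology", "sequence"],
    ["gene", "protein", "biology", "disease"],
    ["government", "budget", "statistics", "census"],
    ["government", "budget", "statistics", "election"],
]
idf = fit_idf(dataset_topics)
vectors = [to_vector(d, idf) for d in dataset_topics]
labels, centers = kmeans(vectors, k=2)

# Hypothetical vocabulary description, matched to the closest dataset topic cluster.
cluster, sim = assign_vocabulary(to_vector(["gene", "protein", "ontology"], idf), centers)
print(labels, cluster, round(sim, 3))
```

k-means can settle in a poor local optimum even with k-means++ seeding, so in practice the clustering is usually repeated with several seeds and the run with the lowest within-cluster distance is kept. A real pipeline would also replace k=2 with the number of topics chosen for the data (15 in the thesis).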
Keywords/Search Tags:Clustering Algorithm, Classification System, RDF, Linked Data, Text Matching