Research And Implementation Of Semantic Similarity Parallel Calculation Based On Association Of Tourism Data

Posted on: 2016-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Jin
Full Text: PDF
GTID: 2308330461450875
Subject: Computer application technology
Abstract/Summary:
With the advent of the information age, data on the network has grown exponentially. Because this data is autonomous, heterogeneous and distributed, information islands have appeared, and the ability to reuse and share data has decreased greatly. Linked data has emerged to solve this problem. Tourism is closely tied to people's daily lives, and the level of development of tourism information is an important measure of the modern tourism industry. Tourist information systems often include reviews, pictures and videos, as well as surrounding information on accommodation, meals, transportation, shopping and entertainment. Earlier travel sites built with web technologies were often just a combination of many HTML pages. Such sites cannot be understood by computers, and people cannot communicate well through them. With the development of the semantic web, computers have become more intelligent, and most resources are open. By collecting, processing and using travel service information and linking it to other data, modern tourism can eliminate data silos, share resources across regions, industries and departments, and maximize the value of tourism information resources.

Building an ontology is the basis for creating linked data. After a review of the relevant national standards, combined with actual tourism data and the results of multiple rounds of interaction with experts, the tourism ontology and its concepts are built and described by defining classes, defining properties, creating instances and realizing the ontology.

The key to semantic discovery in linked data is calculating semantic similarity, and this paper mainly studies the computational efficiency of semantic similarity in the linked data set of tourism resources. Based on the tourism ontology, Jena is used to convert the OWL file into RDF triples and to parse those triples. Building on three existing RDF similarity calculation methods and the MapReduce parallel computing framework, a parallel method for computing the semantic similarity of linked data is designed, which can improve the efficiency of semantic discovery in large-scale linked data.

Finally, the parsed RDF data from the newly established tourism ontology is tested on a Hadoop cluster. Experimental results show that, compared with an implementation on a traditional platform, the parallelized similarity algorithm on the Hadoop cluster increases the capacity and efficiency of massive data processing and has good speedup and scalability. The tourism ontology that was built also increases the value of the data for sharing and reuse, and can serve users more widely.
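The abstract does not name the three RDF similarity methods or show how the MapReduce job is laid out, so the following is only a minimal single-process sketch of the general idea, not the thesis implementation. It assumes Jaccard similarity over each resource's set of (predicate, object) pairs as a stand-in measure, and the `ex:` triples are invented toy data. The map, shuffle and reduce stages are plain Python functions that mirror what a MapReduce framework would distribute across a cluster.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy triples (subject, predicate, object); not data from the thesis.
TRIPLES = [
    ("ex:WestLake", "rdf:type", "ex:ScenicSpot"),
    ("ex:WestLake", "ex:locatedIn", "ex:Hangzhou"),
    ("ex:WestLake", "ex:hasTheme", "ex:Lake"),
    ("ex:LingyinTemple", "rdf:type", "ex:ScenicSpot"),
    ("ex:LingyinTemple", "ex:locatedIn", "ex:Hangzhou"),
    ("ex:LingyinTemple", "ex:hasTheme", "ex:Temple"),
    ("ex:GreatWall", "rdf:type", "ex:ScenicSpot"),
    ("ex:GreatWall", "ex:locatedIn", "ex:Beijing"),
    ("ex:GreatWall", "ex:hasTheme", "ex:Wall"),
]


def map_phase(triples):
    """Map: emit (subject, (predicate, object)) for every triple."""
    for subject, predicate, obj in triples:
        yield subject, (predicate, obj)


def shuffle(pairs):
    """Shuffle: group mapped values by key, as the framework would do."""
    groups = defaultdict(set)
    for key, value in pairs:
        groups[key].add(value)
    return groups


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure, see lead-in)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reduce_phase(groups):
    """Reduce: score every unordered pair of resources."""
    return {
        (x, y): jaccard(groups[x], groups[y])
        for x, y in combinations(sorted(groups), 2)
    }


def similarity(triples):
    """Run the three stages in sequence on one machine."""
    return reduce_phase(shuffle(map_phase(triples)))


if __name__ == "__main__":
    for pair, score in similarity(TRIPLES).items():
        print(pair, round(score, 3))
```

In this toy data the two Hangzhou resources share two of four distinct property pairs, so they score higher than either does against the Beijing resource. On a real cluster the pairwise comparison would need its own partitioning strategy, which this sketch does not attempt.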
Keywords/Search Tags: Tourism ontology, Linked data, Semantic similarity, JENA, Parallel computing, MapReduce