
Research On Distributed Semantic Search For Resources In Distributed Network

Posted on: 2015-10-13
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Hu
Full Text: PDF
GTID: 2298330467463184
Subject: Communication and Information System
Abstract/Summary:
Resource search is a key technology in resource management. Its basic requirement is to return resources that satisfy given constraints to users quickly and accurately. With the rapid development of network technology and the spread of network applications, however, network resources are growing rapidly and carry rich semantic information, which pushes research on resource search toward distributed search and semantic search. Distributed semantic search for resources over P2P networks has therefore become a hot topic. At the same time, the high dimensionality of resources cannot be ignored in distributed semantic search: it degrades search performance, and it is both the focus and the main difficulty of this research.

To achieve distributed semantic similarity search for high-dimensional resources, the basic idea of this paper is to construct a low-dimensional semantic index of those resources. To this end, principal component analysis (PCA) is introduced to extract the latent semantic principal components of high-dimensional resources. PCA projects the resources into a low-dimensional space, producing an index that retains semantic similarity information while reducing the dimensionality. Such a low-dimensional semantic index can effectively support distributed semantic similarity search.

Because building a resource index with traditional PCA requires a centralized network environment, the traditional PCA computation does not suit a distributed P2P network. This paper therefore proposes four distributed solutions for principal component analysis: semi-distributed PCA, hierarchical PCA, fully distributed PCA, and cluster-data PCA. Each improves the adaptability of PCA to distributed P2P networks in a different respect. The feasibility and effectiveness of the four solutions are demonstrated by theoretical analysis and simulation, so that low-dimensional indices of high-dimensional resources can be built effectively with PCA in a distributed network.

Using the low-dimensional semantic index built with the improved PCA, this paper achieves distributed semantic similarity search for high-dimensional resources in a CAN (content-addressable network). Analysis and simulation results show that the distributed semantic search attains high recall and precision.
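The abstract does not describe how the four distributed PCA solutions work, so the following is only a generic sketch of why PCA can be computed without gathering raw data in one place. It is not one of the thesis's schemes. It relies on a standard fact: the covariance matrix can be recovered from per-node counts, column sums, and sums of outer products, which add across nodes. The function names (`local_stats`, `merge_stats`, `pca_from_stats`, `project`), the random data, and the choice of 3 nodes and 4 components are all assumptions made for illustration.

```python
import numpy as np

def local_stats(X):
    """Per-node summary: row count, column sums, sum of outer products."""
    X = np.asarray(X, dtype=float)
    return X.shape[0], X.sum(axis=0), X.T @ X

def merge_stats(stats):
    """Combine per-node summaries by element-wise addition."""
    n = sum(s[0] for s in stats)
    total = sum(s[1] for s in stats)
    outer = sum(s[2] for s in stats)
    return n, total, outer

def pca_from_stats(n, total, outer, k):
    """Return the global mean and the top-k principal axes (as columns)."""
    mean = total / n
    cov = outer / n - np.outer(mean, mean)      # biased covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    return mean, eigvecs[:, order]

def project(X, mean, components):
    """Map high-dimensional rows to their low-dimensional index vectors."""
    return (np.asarray(X, dtype=float) - mean) @ components

# Hypothetical setup: 300 resources with 20 features, spread over 3 nodes.
rng = np.random.default_rng(0)
data = rng.normal(size=(300, 20))
nodes = np.array_split(data, 3)

n, total, outer = merge_stats([local_stats(part) for part in nodes])
mean, comps = pca_from_stats(n, total, outer, k=4)
index = project(data, mean, comps)              # low-dimensional index, shape (300, 4)
```

In this sketch each node shares only a count, a vector, and a matrix whose sizes depend on the feature dimension rather than on the number of resources it holds. How the summaries are routed and merged in a real P2P overlay is left open here.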
Keywords/Search Tags: semantic similarity search, P2P network, low-dimensional indexing, principal component analysis, high dimensionality