
Research And Implementation Of Federated Retrieval Platform Based On Service Oriented Architecture

Posted on: 2015-01-19 | Degree: Master | Type: Thesis
Country: China | Candidate: H C Li | Full Text: PDF
GTID: 2298330422490887 | Subject: Computer Science and Technology
Abstract/Summary:
In recent years, the rapid development and spread of computer technology has changed how enterprises manage information, and multi-department, cross-regional enterprises continue to emerge. The traditional centralized data management model and the peer-to-peer data interaction model cannot meet the needs of enterprise information management and sharing, and their business architectures cannot accommodate dynamic business expansion. In addition, isolation between departments often prevents users from accessing the information of all departments efficiently and in a timely manner.

The goal of this study is to retrieve resources and information from multiple distributed heterogeneous databases in a single request. To meet this demand, this thesis builds a federated information retrieval platform based on service-oriented architecture (SOA) in a VMware enterprise private cloud environment. While realizing the original information management and sharing functions, the use of the VMware IaaS cloud computing service also improves software and hardware utilization, data security, and quality of service.

First, this thesis introduces the concept of SOA and uses distributed Web service technology to implement a flexible, loosely coupled SOA-based architecture that meets the needs of dynamic business expansion. By introducing the concept of metadata, we design a uniform metadata standard for distributed, heterogeneous, and unstructured data, which facilitates centralized management of data resources in a single resource center. This standardized metadata design gives the distributed Web search services the same interface specification. For the federated search results from multiple data centers, after studying various sorting algorithms and testing their recall ratio and sorting efficiency on the TREC test set, we design an adaptive synthesis sorting algorithm suited to the federated information retrieval platform. In addition, because the resource utilization of each server and of the virtual machines on it can be monitored in real time, we design a load balancing feature based on the VMware cloud platform to improve the platform's running stability under high load.

Finally, we design and implement a semantic conflict mediation feature to improve the recall rate of information retrieval. At the same time, by using the abstract data management model of dataspaces and building relationships between objects, we give users a more comprehensive understanding of relevant information through correlation search. Lastly, to identify bottlenecks and deficiencies of the platform, this thesis builds a LoadRunner cluster to test the platform's performance from multiple angles and analyzes the test results.
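To make the result-merging step concrete, the following is a minimal sketch of how ranked lists from several data centers might be combined into one list. It is not the adaptive synthesis sorting algorithm of the thesis, which the abstract does not specify. It assumes a simple min-max score normalization with optional per-source weights, and the names `Hit`, `min_max_normalize`, and `merge_results` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    source: str    # name of the data center that returned the hit (assumed field)
    score: float   # raw, source-specific relevance score


def min_max_normalize(hits):
    """Rescale one source's scores to [0, 1] so that sources become comparable."""
    if not hits:
        return []
    lo = min(h.score for h in hits)
    hi = max(h.score for h in hits)
    span = hi - lo
    # If all scores are equal, treat every hit as equally (fully) relevant.
    return [
        Hit(h.doc_id, h.source, (h.score - lo) / span if span else 1.0)
        for h in hits
    ]


def merge_results(per_source, weights=None, top_k=10):
    """Merge per-source result lists into a single ranked list.

    per_source: dict mapping source name -> list of Hit
    weights:    optional dict mapping source name -> weight (default 1.0)
    """
    weights = weights or {}
    best = {}
    for source, hits in per_source.items():
        w = weights.get(source, 1.0)
        for h in min_max_normalize(hits):
            weighted = Hit(h.doc_id, h.source, w * h.score)
            # Keep only the best-scoring copy of a document seen in several sources.
            if h.doc_id not in best or weighted.score > best[h.doc_id].score:
                best[h.doc_id] = weighted
    # Sort by descending score; break ties by doc_id for a deterministic order.
    ranked = sorted(best.values(), key=lambda h: (-h.score, h.doc_id))
    return ranked[:top_k]
```

Normalizing before merging matters because independent search services usually score on different scales. The per-source weights are one assumed place where an adaptive scheme could adjust the influence of each data center.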
Keywords/Search Tags: Federated Search, Cloud Platform, Service Oriented Architecture, Merging Results