
Building blocks for composable Web services

Posted on: 2004-03-19
Degree: Ph.D
Type: Dissertation
University: Georgia Institute of Technology
Candidate: Buttler, David
Full Text: PDF
GTID: 1468390011968302
Subject: Computer Science
Abstract/Summary:
We present three basic building-block technologies for a wide range of Web services: a methodology for efficient and automatic object extraction that provides significant improvements over existing efforts, a comprehensive approach to service selection that combines highly accurate schema-based selection with generic content-based selection, and a scalable Page Digest-enhanced change detection framework with efficient sentinel processing algorithms that offer more than an order of magnitude improvement.

Object extraction is an essential component of many data-intensive Web services that use content generated from deep Web databases. These pages are designed for user browsing, which makes it difficult for machines to integrate their data into composite services. Automated techniques that reliably extract this data as page designs change are crucial. We present the Omini methodology for a fully automated object extraction system for dynamically generated Web pages, consisting of a layered approach to identify data regions in pages and extract individual objects. We evaluated Omini using more than 3,200 pages from more than 100 diverse Web sites, achieving a 96% success rate for minimal object-rich region identification and a 95% success rate for discovering object boundaries. Our algorithms are fast, and the overall system achieves between 95% and 96% precision and between 96% and 100% recall.

Tracking and detecting changes on the Web has become a fundamental type of service. Our change detection research focuses on scalability techniques for Web change monitoring systems designed to handle millions of information monitoring requests. We describe three components: a new class of sentinels for monitoring changes in the content and structure of a page, as well as measuring the percentage of change; mechanisms for computing document and structure similarity; and sentinel grouping techniques based on the Page Digest encoding for effectively eliminating redundant processing and unnecessary network communication. We show an order of magnitude speedup over previous efforts and provide mechanisms to further scale the system.

We introduce the notation and issues of Web service routing, and present a practical solution for designing a scalable Web service selection system based on multi-level progressive pruning strategies. The key idea is to create and maintain service-capability profiles independently, and to provide algorithms that can dynamically discover relevant services for a given query based on the contents provided and through the use of service and user profiles. Our approach offers precise interest matching for handling queries with complex conditions.
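The multi-level progressive pruning idea in the service selection paragraph can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the `ServiceProfile` fields, the three pruning levels, and the `select_services` function are all hypothetical, chosen only to show how cheap filters can narrow the candidate set before a costlier content-based comparison.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceProfile:
    """Hypothetical service-capability profile, maintained independently of queries."""
    name: str
    category: str                 # coarse capability label (assumed for illustration)
    input_fields: frozenset       # schema fields the service accepts
    keywords: frozenset = field(default_factory=frozenset)  # content terms


def select_services(profiles, category, query_fields, query_terms, min_overlap=1):
    """Prune candidates level by level, cheapest test first."""
    # Level 1: coarse category filter.
    level1 = [p for p in profiles if p.category == category]
    # Level 2: schema-based filter; the service must accept every query field.
    level2 = [p for p in level1 if query_fields <= p.input_fields]
    # Level 3: content-based score; count keyword overlap with the query terms.
    scored = [(len(query_terms & p.keywords), p) for p in level2]
    kept = [(score, p) for score, p in scored if score >= min_overlap]
    # Highest overlap first; ties broken by name for a deterministic order.
    kept.sort(key=lambda sp: (-sp[0], sp[1].name))
    return [p.name for _, p in kept]


# Made-up profiles, used only to exercise the sketch.
profiles = [
    ServiceProfile("BookFinder", "books", frozenset({"title", "author"}),
                   frozenset({"novel", "isbn"})),
    ServiceProfile("PaperSearch", "books", frozenset({"title"}),
                   frozenset({"journal"})),
    ServiceProfile("FlightDeals", "travel", frozenset({"origin", "destination"}),
                   frozenset({"fare"})),
]

matches = select_services(profiles, "books", frozenset({"title"}),
                          frozenset({"isbn", "novel"}))
```

Ordering the levels from cheapest to most expensive means the content comparison runs only on services that already passed the category and schema checks, which is the general motivation for progressive pruning.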
Keywords/Search Tags: Service, Web, Object