Research On Problem Of Text Categorization PSE Based On Web Service Composition

Posted on: 2009-01-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Mei
Full Text: PDF
GTID: 1118360245999255
Subject: Computer application technology
Abstract/Summary:
With the convergence of web service technology and grid technology, many web service-based applications have appeared in various fields. Among them, the web service-based Problem Solving Environment (PSE) is an emerging technology that has become a research hotspot and is widely used in computer applications. Text Categorization (TC) can be regarded as an approach to the classification of texts. Various classification algorithms for TC have been studied, but these algorithms lack uniform management and expose heterogeneous interfaces. Furthermore, as classification precision increases and large-scale text data becomes more common, traditional technologies cannot quickly supply the computational resources that the text classification process requires.
By packaging classification algorithm resources, web service technology not only provides unified administration of those resources and an open standard interface, but also supports efficient aggregation of resources for the classification process. In order to share classification algorithms and improve the efficiency of research, a web service-based PSE application, the Problem Solving Environment for Text Categorization (PSE-TC), is developed. PSE-TC provides researchers with large-scale parallel computation, algorithm comparison and result analysis.
The main work of this thesis covers the following aspects:
1. Research on the PSE-TC system structure. Drawing on the Web Services Resource Framework (WSRF) and related studies of PSE applications, and taking the characteristics of TC into account, the concept of a service platform that integrates classification algorithms is proposed. A web service-based four-layer architecture is then given, consisting of the resource provider layer, the service integration layer, the task execution layer and the web portal layer.
2. Research on the expanded web service architecture. The web service integration layer uses Tomcat and JBoss as application servers, which provide the grid resource integration service. AXIS is used as the component for publishing services and offers an application programming interface suited to TC algorithm research. Web services are the key technology throughout the classification process: they are involved in building the text classifier, classifying text and serving the status monitor.
3. Research on web service security assurance. In view of the access control requirements of web services, this thesis describes a lightweight authorization service for service access control, the Uniform Security Authorization Service (USAS). USAS divides users into different levels according to defined access control policies, builds an authentication and authorization mechanism, and separates user certification management from user role authority. These functions provide a reference for further research on service security.
4. Research on workflow based on web service composition. To improve resource utilization and the accuracy of task scheduling, the concepts of domain and domain members are introduced. On the basis of the hierarchy and ordering relation among the members, a service workflow model is established. Based on this model, an optimal service composition algorithm is studied, which can resolve the resource conflict problem, the pattern ossification problem and the job treatment problem.
5. Research on the feedback application of the text classifier. In this thesis, feedback control learning is applied to modify and rebuild the text classifier. The Support Vector Machine (SVM) is taken as an example to describe the full feedback learning process: building the feedback set manually, optimizing and pruning the support vectors, and rebuilding the classifier. With feedback learning, the effectiveness and efficiency of the classifier model can be improved greatly using a small quantity of feedback texts.
At the end of this thesis, experiments are performed on PSE-TC and the related TC application system. Comparison and analysis of the experimental results demonstrate the feasibility and validity of the proposed theories and technologies.
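The separation that USAS draws between user certification management and user role authority (item 3) might be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the class names `Authenticator` and `Authorizer`, the function `authorize_call`, the level names, and the service names are all hypothetical, and Python is used only for brevity.

```python
import hashlib
import hmac
import os


class Authenticator:
    """Certification management (hypothetical): keeps salted password hashes only."""

    def __init__(self):
        self._records = {}  # user name -> (salt, digest)

    @staticmethod
    def _digest(password, salt):
        # PBKDF2 from the standard library; the iteration count is an arbitrary choice.
        return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

    def register(self, user, password):
        salt = os.urandom(16)
        self._records[user] = (salt, self._digest(password, salt))

    def verify(self, user, password):
        record = self._records.get(user)
        if record is None:
            return False
        salt, expected = record
        # Constant-time comparison of the stored and recomputed digests.
        return hmac.compare_digest(expected, self._digest(password, salt))


class Authorizer:
    """Role authority (hypothetical): maps users to levels and levels to services."""

    def __init__(self, policy):
        # policy: level name -> iterable of service names that level may call
        self._policy = {level: frozenset(services) for level, services in policy.items()}
        self._levels = {}  # user name -> level name

    def assign(self, user, level):
        if level not in self._policy:
            raise ValueError(f"unknown level: {level}")
        self._levels[user] = level

    def permits(self, user, service):
        level = self._levels.get(user)
        return level is not None and service in self._policy[level]


def authorize_call(authn, authz, user, password, service):
    """Allow a service call only if the user authenticates and the level permits it."""
    return authn.verify(user, password) and authz.permits(user, service)
```

Because the two classes share no state, credentials can be changed without touching the access policy, and a user's level can be changed without touching credentials, which is the kind of separation the abstract describes.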
Keywords/Search Tags: Problem Solving Environment, Web Service, Text Categorization, Web Service Composition, Feedback Control
Related items