
Automatic text classification using a multi-agent framework

Posted on: 2007-09-25
Degree: Ph.D.
Type: Dissertation
University: Indiana University
Candidate: Fu, Yueyu
Full Text: PDF
GTID: 1458390005486575
Subject: Information Science
Abstract/Summary:
Automatic text classification is an important operational problem in information systems. Most automatic text classification efforts so far have concentrated on developing centralized solutions. However, centralized classification approaches are often limited by constraints on knowledge and computing resources. To overcome these limitations, an alternative distributed approach based on a multi-agent framework is proposed. Three major challenges associated with distributed text classification are examined: (1) coordinating classification activities in a distributed environment, (2) achieving high-quality classification, and (3) minimizing communication overhead. This study presents solutions to these specific challenges and describes a prototype system implementation. As agent coordination is the key component in conducting multi-agent text classification, two agent coordination protocols, namely the blackboard-bidding protocol and the adaptive-blackboard protocol, are proposed in the study. To analyze the performance of the distributed approach, a comparative evaluation methodology is described, which treats the outcome of a centralized approach as the baseline performance. A series of experiments was conducted in a simulation environment. The simulation environment permitted manipulation of independent variables such as scalability and coordination strategy, and investigation of the impact on two critical dependent variables, namely efficiency and effectiveness. There were three critical findings. First, in dealing with automatic text classification, the multi-agent approach can achieve improved system efficiency while maintaining classification effectiveness comparable to a centralized approach. Second, the agent protocols were effective in coordinating the text classification activities of distributed agents. Third, the application of content-based adaptive learning for acquiring knowledge about the agent community reduced communication cost and improved system efficiency.
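
The abstract names a blackboard-bidding protocol but does not spell out its mechanics. As a rough illustration of the general idea only, the sketch below shows one generic way a blackboard with bidding might work: each agent scores a document against the categories it knows, posts its best score as a bid on a shared blackboard, and the highest bid wins. The class names (Agent, Blackboard), the keyword-overlap scoring, and the award rule are all assumptions made for this example, not the dissertation's actual design.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical classifier agent; knows a few categories by keyword."""
    name: str
    keywords: dict  # category -> set of indicative terms (illustrative only)

    def bid(self, text):
        """Return (score, category) for the best-matching category, or None."""
        tokens = set(text.lower().split())
        best = None
        for category, terms in self.keywords.items():
            if not terms:
                continue  # skip empty term sets to avoid dividing by zero
            # Assumed scoring rule: fraction of the category's terms present.
            score = len(tokens & terms) / len(terms)
            if score > 0 and (best is None or score > best[0]):
                best = (score, category)
        return best


@dataclass
class Blackboard:
    """Shared space where agents leave their bids for one document."""
    bids: list = field(default_factory=list)

    def post(self, agent, bid):
        score, category = bid
        self.bids.append((score, category, agent.name))

    def award(self):
        """Pick the highest-scoring bid; None if no agent bid."""
        return max(self.bids) if self.bids else None


def classify(text, agents):
    """Collect bids from every agent and return the winning bid, if any."""
    board = Blackboard()
    for agent in agents:
        bid = agent.bid(text)
        if bid is not None:
            board.post(agent, bid)
    return board.award()


# Two toy agents with made-up vocabularies.
agents = [
    Agent("sports_agent", {"sports": {"match", "goal", "team"}}),
    Agent("finance_agent", {"finance": {"stock", "market", "bond"}}),
]

if __name__ == "__main__":
    print(classify("The team scored a late goal", agents))
```

In this sketch every agent is asked to bid on every document, so communication grows with the number of agents. The third finding suggests that learning which agents are knowledgeable about which content can reduce that cost, though how the study achieves this is not described in the abstract.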
Keywords/Search Tags:Text classification, Agent, System