| Fault management is an important branch of network management. Its goal is to detect, locate, and log network problems so that they can be fixed promptly and their impact minimized. Current fault management approaches commonly suffer from overhead in network bandwidth or in the CPU time of managed devices, which degrades network performance. In this paper, we present an innovative fault management approach based on link state routing protocols. Because every router keeps a database reflecting the global network topology and announces link state changes to all other routers so that they can update their databases, we can detect topology changes across the whole area by monitoring a single router in the network. Taking TCP/IP networks running a typical link state routing protocol, I-ISIS, as an example, we set up a fault management model by connecting an agent to one router in the network. We classify the major network faults into four types and define atomic changes of the database as events. First, the agent acquires the router's link state database and calculates the correlations between all possible faults in the current network and the database change events; this is the core of our model. Second, the agent triggers events by monitoring changes in the router's link state database. Finally, the agent detects and locates faults in the network based on the precalculated correlations between faults and events. Since the agent connects to only one router and receives packets passively, it consumes no additional network bandwidth. Furthermore, since all analysis and computation are carried out on the agent, no router CPU time is consumed.
In addition, our model exploits the fast convergence of the I-ISIS protocol, so it can detect and locate faults very quickly. We integrate this fault management model into ROMA (Router Online Management Agent), developed by Lucent Technologies Bell Labs Research China, and apply it to a simulated CERNET (China Education and Research Network) backbone for testing. The results show that the model can detect and locate faults quickly without affecting network performance. |
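
The three steps summarized above (precalculate fault-event correlations, turn database changes into events, match events to a fault) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the link state database is reduced to a map from each router to the neighbours it advertises, adjacencies are assumed symmetric, only two hypothetical fault types (`link_down`, `router_down`) stand in for the four types the paper defines, and the single event kind `adj_lost` and all function names are invented here. It is not the ROMA implementation.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Event = Tuple[str, str, str]   # ("adj_lost", advertising_router, neighbour), hypothetical
Fault = Tuple[str, ...]        # ("link_down", a, b) or ("router_down", r), hypothetical
Lsdb = Dict[str, Set[str]]     # simplified LSDB: router -> neighbours it advertises


def precompute_correlations(lsdb: Lsdb) -> Dict[Fault, FrozenSet[Event]]:
    """Step 1: for every possible fault, list the events it is expected to cause."""
    table: Dict[Fault, FrozenSet[Event]] = {}
    for router, neighbours in lsdb.items():
        # A failed router cannot update its own entry; its neighbours withdraw it.
        table[("router_down", router)] = frozenset(
            ("adj_lost", n, router) for n in neighbours
        )
        for n in neighbours:
            if router < n:  # record each undirected link once
                # A failed link is withdrawn by both of its endpoints.
                table[("link_down", router, n)] = frozenset(
                    {("adj_lost", router, n), ("adj_lost", n, router)}
                )
    return table


def diff_events(old: Lsdb, new: Lsdb) -> FrozenSet[Event]:
    """Step 2: turn the difference between two database snapshots into events."""
    events = set()
    for router, neighbours in old.items():
        for lost in neighbours - new.get(router, set()):
            events.add(("adj_lost", router, lost))
    return frozenset(events)


def locate(table: Dict[Fault, FrozenSet[Event]],
           observed: FrozenSet[Event]) -> List[Fault]:
    """Step 3: return the faults whose expected events equal the observed ones."""
    if not observed:
        return []
    return [fault for fault, expected in table.items() if expected == observed]


if __name__ == "__main__":
    before = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
    table = precompute_correlations(before)
    # Link A-B fails: both A and B stop advertising each other.
    after = {"A": {"C"}, "B": {"C"}, "C": {"A", "B"}}
    print(locate(table, diff_events(before, after)))
```

Exact set matching is the simplest possible correlation rule and only handles a single fault at a time; overlapping faults or partially received updates would need a more tolerant matching scheme.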