Probabilistic Fault Localization For Value-Added Services

Posted on: 2011-05-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Zhang
GTID: 1118360308962206
Subject: Computer application technology
Abstract/Summary:
With the advance of IN (Intelligent Network), 3G/IMS (Third Generation/IP Multimedia Subsystem), and NGN (Next Generation Network), the service provisioning capability of communication networks has greatly improved, and more and more value-added services are emerging, posing new challenges for network management and OAM (Operation, Administration and Maintenance). Unavailable services and poor QoS (Quality of Service) cause not only loss of revenue but also erosion of customer loyalty and even loss of customers. Fault diagnosis is a key technology for ensuring high availability, high reliability, and quality of service. Fault localization, a central element of fault diagnosis, largely determines its efficiency and effectiveness. The study of fault diagnosis techniques for value-added services, especially fault localization techniques, is therefore important for both industrial application and academic research.

Traditional fault diagnosis focuses on detecting and localizing faults in devices and networks. It attends to the status of device operations and network connections but fails to consider the relationships between resources and services, such as causality, the way of impact, and the strength of dependency. Service faults differ from traditional faults in several respects: (1) modeling service faults is more difficult. Compared with resource fault modeling in traditional fault management, service fault modeling is more challenging because of the diversity, dynamics, abstractness, dependencies, and multi-domain nature of services; (2) the root causes of service failures are more complicated. Besides the network, platform, software, etc., faults are often caused by users; (3) the scope of service faults has been extended.
Requirements for high availability and operation of services mean that service failures include not only function faults and performance faults but also support (assistant function) faults and inter-domain faults; (4) the non-deterministic status of a service fault is usually difficult to recognize. The status of service operation often has to be judged from the context and environment, especially when service quality degrades gradually; and (5) the impact of service faults on users is greater than that of resource faults. Users' sensitivity imposes further challenges on the efficiency and effectiveness of service fault localization.

This dissertation takes emerging value-added services as its research object. It aims to reduce the computational complexity of fault localization, improve the accuracy of fault detection, shorten the fault localization time, and improve the efficiency and effectiveness of fault localization, focusing on the key technologies for fault localization of runtime value-added services. The innovations of the research are as follows:

(1) Traditional fault models often lack the relationships between resources and services and do not consider the dependencies, the way of impact, or the strength of dependency. Therefore, we propose two modeling approaches: fault modeling based on Statistics and Data Mining (SDM), and fault modeling inspired by overlay networks and end-to-end service provisioning (ONEE). ONEE consists of two sub-methods: horizontal fault modeling within service components and vertical fault modeling between service components and resource components. These approaches bridge the gap between value-added services and the fault diagnosis system and generate models for value-added services accurately and quickly.

(2) Optimal probabilistic fault localization has been proven to be NP-hard and can hardly be applied to large-scale, real-time value-added services.
Considering the requirements of probabilistic fault localization for value-added services, we present a heuristic fault localization algorithm called BSD (Bayesian Suspect Degree), based on a probabilistic bipartite graph and a greedy strategy. Unlike existing algorithms based on the minimum set cover problem, BSD uses valid incremental coverage, which reduces the likelihood of selecting faults falsely. Analysis and simulations demonstrate the efficiency and effectiveness of BSD.

(3) Most existing algorithms depend on the symptoms observed in certain time windows. In practice, however, the appropriate size of a time window cannot be determined accurately, and improper time windows may noticeably degrade the performance of fault localization algorithms. Given this limitation of time windows in OAM practice, we develop an event-driven incremental probabilistic fault diagnosis algorithm called IBSD (Incremental Bayesian Suspect Degree). IBSD overcomes the drawback of inaccurate time windows in fault localization. Simulations show that IBSD outperforms the existing IHU (Incremental Hypothesis Update) algorithm.

(4) Although BSD and IBSD are effective in the presence of slight noise, their performance degrades under heavy noise because they make no special provision for robustness. Thus, based on BSD, we present an algorithm called MICAS (Minimum Interactive Checking with Adaptive Strategy). Through an enhanced evaluation function, minimum interactive checking, and adaptive threshold setting, MICAS achieves excellent fault localization performance in the presence of a large number of lost and spurious symptoms.

(5) Event-driven fault localization algorithms can eliminate the effect of inaccurate symptom observation windows, but they are inefficient and have difficulty handling large numbers of concurrent symptoms. Moreover, an insufficient accumulation of symptoms often leads to a wrong judgment, which is useless to network operators.
We therefore need to consider not only the observation window but also efficiency. Accordingly, we present a fault localization algorithm based on a sliding window with a preprocessing mechanism (SWPM). Simulation results demonstrate the validity of SWPM.
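
To make the idea behind contribution (2) more concrete, the following is a minimal sketch of greedy fault localization on a probabilistic bipartite graph. It is not the BSD algorithm itself: the abstract does not give the suspect-degree formula, so the scoring rule here (prior fault probability times the summed causal probabilities over symptoms not yet explained) is an assumption chosen only for illustration. The names `greedy_localize`, `prior`, `cause`, and `observed`, and the example probabilities, are likewise hypothetical. Scoring only the still-unexplained symptoms is meant to echo, loosely, the notion of incremental coverage mentioned above.

```python
from typing import Dict, List, Set, Tuple


def score(fault: str,
          prior: Dict[str, float],
          cause: Dict[str, Dict[str, float]],
          unexplained: Set[str]) -> float:
    """Illustrative score (an assumption, not the BSD formula):
    prior probability of the fault times the summed probability that
    it causes each symptom that is still unexplained."""
    edges = cause.get(fault, {})
    return prior[fault] * sum(edges.get(s, 0.0) for s in unexplained)


def greedy_localize(prior: Dict[str, float],
                    cause: Dict[str, Dict[str, float]],
                    observed: Set[str]) -> Tuple[List[str], Set[str]]:
    """Greedily pick faults until every observed symptom is explained
    or no remaining fault explains anything new."""
    unexplained = set(observed)
    candidates = set(prior)
    hypothesis: List[str] = []
    while unexplained and candidates:
        # Sort first so that ties are broken deterministically by name.
        best = max(sorted(candidates),
                   key=lambda f: score(f, prior, cause, unexplained))
        if score(best, prior, cause, unexplained) <= 0.0:
            break  # nothing left can explain the remaining symptoms
        hypothesis.append(best)
        candidates.remove(best)
        # Only newly covered symptoms count in later rounds.
        unexplained -= {s for s, p in cause.get(best, {}).items() if p > 0.0}
    return hypothesis, unexplained


# Hypothetical example: three faults, three observed symptoms.
prior = {"db": 0.01, "link": 0.05, "app": 0.02}
cause = {
    "db": {"s1": 0.9, "s2": 0.8},
    "link": {"s2": 0.3, "s3": 0.9},
    "app": {"s3": 0.4},
}
observed = {"s1", "s2", "s3"}
hypothesis, leftover = greedy_localize(prior, cause, observed)
```

With these made-up numbers, `link` is selected first because it has the highest score over all three symptoms, after which only `s1` remains unexplained and `db` is selected. A symptom that no fault can cause is returned in `leftover` rather than forcing a spurious selection.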
Keywords/Search Tags: Value-added Services, Network Management, Fault Management, Fault Diagnosis, Fault Propagation Model, Fault Localization, Alarm Correlation