Detection Of Application-Level Failures In Large-Scale Internet Service

Posted on: 2011-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: L Wu
Full Text: PDF
GTID: 2178360308461268
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet services and their underlying technology, the number of large-scale Internet services has grown significantly in recent years. At the same time, web applications are becoming increasingly complex and hard to manage. Because of their multi-level structure, Internet services are inevitably affected by failures from different levels. In particular, non-fail-stop application-level failures have a first-order impact on the user experience and may result in temporary or permanent site abandonment. To ensure QoS in Internet services, it is critical to detect application-level failures quickly and accurately. Existing detection methods have the following limitations: 1) they rely heavily on the internal structure of the managed system; 2) their computational complexity is too high.

In this thesis, we analyze the characteristics of Internet services and propose a general approach to detecting all kinds of failures that may occur at the application level. The main contributions are: 1) We summarize the layered structure of Internet services and the challenges faced in each layer, and then analyze in depth the categories and characteristics of application-level failures. 2) We describe a novel method for runtime problem detection based on metrics of "black-box" user behavior, which requires neither additional system instrumentation nor prior input from the operator. 3) We apply a classical algorithm to decompose the high-dimensional training data into two subspaces, which both forms the baseline for subsequent detection and reduces the computational complexity. 4) We evaluate our method on Pet Store, a J2EE application, and compare it with an existing method.

The comparison shows that our method detects all failures with only a few false alarms, which is valuable for subsequent failure localization.
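The subspace decomposition in contribution 3 can be illustrated with a small sketch. The abstract does not name the classical algorithm, so this example assumes principal component analysis (PCA), a common choice for splitting data into a "normal" and a "residual" subspace. The function names, the six synthetic metrics, the choice of k = 2, and the 99% quantile threshold are all hypothetical and not taken from the thesis. Each row of the training matrix stands for one time window of black-box user-behavior metrics, such as request counts per page.

```python
import numpy as np


def fit_normal_subspace(train, k, quantile=0.99):
    """Fit a k-dimensional 'normal' subspace to training windows.

    train: array of shape (n_windows, n_metrics), one row per time window.
    Returns the metric means, an orthonormal basis of the normal subspace,
    and a detection threshold on the residual energy.
    """
    mean = train.mean(axis=0)
    centered = train - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k].T  # shape (n_metrics, k)
    # Residual energy (squared prediction error) of each training window.
    residual = centered - centered @ basis @ basis.T
    spe = np.sum(residual ** 2, axis=1)
    # Assumption: use an empirical quantile of the training residual
    # energy as the alarm threshold.
    threshold = float(np.quantile(spe, quantile))
    return mean, basis, threshold


def residual_energy(window, mean, basis):
    """Squared norm of the part of a window outside the normal subspace."""
    centered = window - mean
    residual = centered - basis @ (basis.T @ centered)
    return float(residual @ residual)


# Synthetic training data (hypothetical): six metrics driven by two
# underlying usage patterns plus a small amount of noise.
rng = np.random.default_rng(0)
mixing = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
latent = rng.normal(size=(500, 2))
train = 10.0 + latent @ mixing + 0.01 * rng.normal(size=(500, 6))

mean, basis, threshold = fit_normal_subspace(train, k=2)

# A window that follows the usual correlations between metrics.
normal_window = 10.0 + np.array([0.5, -0.3]) @ mixing
# The same window with one metric shifted on its own, which breaks
# the learned correlation structure.
faulty_window = normal_window.copy()
faulty_window[0] += 1.0

normal_flag = residual_energy(normal_window, mean, basis) > threshold
faulty_flag = residual_energy(faulty_window, mean, basis) > threshold
print(normal_flag, faulty_flag)
```

In this sketch, detection only projects a new window onto a fixed k-dimensional basis, so the per-window cost grows with k times the number of metrics. This is one plausible reading of how the decomposition lowers computational complexity.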
Keywords/Search Tags: Internet Service, Application-level failure detection, User behavior, Subspace decomposition