
Anomaly detection in data streams with A-Distance: Effects of multiple anomalous operators on accuracy

Posted on: 2015-04-16
Degree: M.S.
Type: Thesis
University: University of Maryland, Baltimore County
Candidate: Huang, Shang-Ling
Full Text: PDF
GTID: 2478390017995572
Subject: Computer Science
Abstract/Summary:
Recent research has shown positive outcomes in using the A-Distance metric to evaluate the current state of a planning domain and find anomalies with a low false positive rate. Because the A-Distance metric compares two arbitrary probability distributions, previous research converted the planning domains from a symbolic representation to a vector representation. Each column of the vector represents a predicate in the domain, so there are as many data streams as there are predicates.

To create an anomaly, previous work removed a single operator from the planning domain, which causes a shift in the types of problems that can be solved. However, anomalies can affect many operators. This thesis investigates the effects of multiple anomalous operators on accuracy. The anomalies considered are adding multiple new operators to the domain under the closed world assumption, deleting multiple existing operators from the domain, and applying failure rates to the original set of operators in the domain.

Additionally, the thesis explores three ways of structuring the data streams with the aim of further reducing the false positive rate: taking the sum of all values in a vector, taking the sum of the absolute values of their discrete derivatives, and running a Principal Component Analysis (PCA) on the data streams.
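The first two stream-structuring methods can be illustrated with a minimal Python sketch. It is one possible reading of the abstract, not the thesis's implementation: it assumes each planning state is a 0/1 vector with one entry per predicate, that the "sum" collapses each state vector to a single value, and that the "discrete derivative" is the difference between consecutive states. The function names `sum_stream` and `abs_derivative_stream` and the sample data are hypothetical. The PCA variant is not shown, since it would typically rely on a numerical library.

```python
def sum_stream(states):
    """Collapse each state vector to one value: the sum of its entries.

    Assumption: `states` is a list of equal-length numeric vectors,
    one vector per time step and one entry per predicate.
    """
    return [sum(vec) for vec in states]


def abs_derivative_stream(states):
    """For each pair of consecutive states, sum |x_t - x_(t-1)| over all predicates.

    Assumption: the discrete derivative is the first difference in time.
    The result has one fewer element than `states`.
    """
    return [
        sum(abs(cur - prev) for cur, prev in zip(states[t], states[t - 1]))
        for t in range(1, len(states))
    ]


# Hypothetical example: three time steps, three predicates.
states = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
]

summed = sum_stream(states)
changes = abs_derivative_stream(states)
print(summed)
print(changes)
```

In this toy data the plain sum is the same for the first two states even though two predicates changed value, while the absolute-derivative stream registers both changes. Under the assumptions above, that is the kind of difference the two representations would expose to the detector.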
Keywords/Search Tags:Data streams, Operators, A-distance, Domain, Multiple