Distributed Knowledge Discovery for Diverse Data

Posted on:2018-08-16

Degree:Ph.D

Type:Dissertation

University:The University of New Mexico

Candidate:Hamooni, Hossein

GTID:1478390020457593

Subject:Computer Science

Abstract/Summary:

In the era of new technologies, computer scientists deal with massive datasets that reach hundreds of terabytes in size. Smart cities, social networks, health care systems, large sensor networks, and similar sources constantly generate new data. Extracting knowledge from such datasets is non-trivial because traditional data mining algorithms become impractical to run at this scale. Distributed systems help address this problem, but they introduce new challenges in designing scalable algorithms. The transition from traditional algorithms to ones that run on a distributed platform must be made carefully, and researchers should design distributed algorithms around the problem domain. The main goal of this dissertation is to demonstrate the importance of domain-specific knowledge in developing scalable knowledge discovery algorithms on distributed systems. Data properties such as origin, type, context, and size play important roles in achieving speed, efficiency, and scalability. In this dissertation, I describe three domain-specific knowledge discovery systems in three diverse domains: a distributed algorithm to extract patterns from log messages generated by computers, a distributed algorithm to find abnormal behavior in social media, and a scalable algorithm for matching patterns in streaming time series data. I explain how to exploit data properties in a distributed knowledge discovery system to achieve scalability and speed. The algorithms achieve horizontal scalability for any data size, and the systems are currently deployed at the University of New Mexico.

Keywords/Search Tags:

Data, Knowledge discovery, Distributed, New, Algorithms, Size, Systems

Related items

1	Study On Multi Agent-based Enterprise Distributed Association Rules Discovery
2	Based On Knowledge Discovery Mechanism Of Enterprise Decision Support Systems Research
3	Research On Optimization Of Knowledge Discovery Algorithm For Massive Data Based On Coarse And Fine Grain Size
4	Declarative Languages and Scalable Systems for Graph Analytics and Knowledge Discovery
5	Research On Distributed Knowledge Discovery System Based Mobile Agent
6	Research On The Key Technologies Of Distributed Big Data Consistency Management
7	Knowledge discovery systems for large-scale spatial, time, sequence, and unstructured data
8	Formula Discovery Of Data Mining Algorithms And Improvement Of The Fdd.1
9	Integrating knowledge discovery techniques with prior domain knowledge for better decision support
10	Analysis of institutional data in predicting student retention utilizing knowledge discovery and statistical techniques