Data Mining Meets HCI: Making Sense of Large Graphs

Posted on: 2013-12-03
Degree: Ph.D.
Type: Thesis
University: Carnegie Mellon University
Candidate: Chau, Duen Horng (Polo)
Full Text: PDF
GTID: 2458390008480591
Subject: Computer Science
Abstract/Summary:
We have entered the age of big data. Massive datasets are now common in science, government and enterprises. Yet, making sense of these data remains a fundamental challenge. Where do we start our analysis? Where do we go next? How do we visualize our findings?

We answer these questions by bridging Data Mining and Human-Computer Interaction (HCI) to create tools for making sense of graphs with billions of nodes and edges, focusing on:

(1) Attention Routing: we introduce this idea, based on anomaly detection, that automatically draws people's attention to interesting areas of the graph to start their analyses. We present two examples: Polonium unearths malware from 37 billion machine-file relationships; NetProbe fingers bad guys who commit auction fraud.

(2) Mixed-Initiative Sensemaking: we present two examples that combine machine inference and visualization to help users locate next areas of interest: Apolo guides users to explore large graphs by learning from a few examples of user interest; Graphite finds interesting subgraphs, based on only fuzzy descriptions drawn graphically.

(3) Scaling Up: we show how to enable interactive analytics of large graphs by leveraging Hadoop, staging of operations, and approximate computation.

This thesis contributes to data mining, HCI, and importantly their intersection, including: interactive systems and algorithms that scale; theories that unify graph mining approaches; and paradigms that overcome fundamental challenges in visual analytics.

Our work is making an impact on academia and society: Polonium protects 120 million people worldwide from malware; NetProbe made headlines on CNN, WSJ and USA Today; Pegasus won an open-source software award; Apolo helps DARPA detect insider threats and prevent exfiltration.

We hope our Big Data Mantra "Machine for Attention Routing, Human for Interaction" will inspire more innovations at the crossroads of data mining and HCI.
Keywords/Search Tags: Data, HCI, Making, Large, Graphs