
Visual Summarization Of Temporal Event Sequences

Posted on: 2020-11-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S N Guo
GTID: 1368330620952036
Subject: Software engineering
Abstract/Summary:
Temporal event sequences record ordered series of discrete events that have occurred over a period of time. For example, electronic health records contain sequences of timestamped medical events (e.g., diagnoses, lab tests, treatments) that occur over the course of the clinical processes of specific patients. Similarly, data capturing the behavior of web visitors can include a series of clickstream events (e.g., mouse clicks, page visits) during a web session. With the rapid progress of information digitalization, data from a wide range of application areas has been captured in the form of temporal event sequences. Data analysts often wish to extract common patterns of event progression and correlation from massive collections of event sequences. This is challenging, however, because of the high dimensionality (i.e., the large number of event types) and the temporal dynamics of such data. A rich variety of recent research from both the data mining and visualization communities has been developed to address this challenge. In particular, analysis-based techniques have focused on either summarizing event sequences with complex event models or extracting latent stages that occur frequently within a set of sequences. They produce highly summarized results that can highlight interesting high-level structures (e.g., event correlations, sequential stages of sequences), but they often fail to show important low-level event details (such as the raw, individual event features that contribute to an aggregate summarization), which can help build practitioners' confidence in model performance and make it easier to derive actionable insights.

In contrast, many event sequence visualization designs focus on precisely capturing details about how individual events occur in sequence over time. This has led to recent methods that rely on prioritization or simplification so that these approaches can scale to the complexity required for many real-world tasks. Yet even in these cases, the visualized paths of event sequences are closely tied to the low-level representation of individual events or sub-sequences, which makes it difficult to discover or understand higher-level structures within the data.

Therefore, there is a gap between the capabilities of existing analysis and visualization techniques designed for temporal event sequence data. A desirable visual analysis system should be able to discover and communicate latent high-level structures within a complex collection of event sequence data, while at the same time providing users with information about the low-level events and sub-sequences of events that characterize those structures, in order to support semantic interpretation of the findings.

This thesis is intended to fill the gap between existing analysis and visualization techniques by leveraging the advantages of both data mining and visualization. A series of visual analytics approaches are proposed, which aim to extract latent stages from the temporal aspect, produce summarizations for major groups of sequences from the event aspect, and detect rare sequences and events in an interpretable and interactive manner. The main contributions of this thesis are:

· A novel visual analytics method for identifying semantically meaningful progression stages in a collection of event sequences. This method is driven by unsupervised stage analysis through event representation estimation, event sequence warping and alignment, and sequence segmentation. We also present a novel visualization system, ET2, which interactively illustrates the critical events that help define a stage and reveals evolution patterns across stages.

· A novel technique for analyzing and visualizing latent evolution patterns within large-scale event sequence datasets. This technique includes a novel sequence summarization algorithm that clusters event sequences into threads based on tensor analysis, and a visual interface, Event Thread, that allows interactive exploration and similarity analysis of the threads to derive latent stage categories.

· A novel visual comparison technique for detecting interpretable anomalies in event sequence data. We introduce an unsupervised anomaly detection algorithm based on Variational Autoencoders (VAE). The model learns latent representations for all sequences and can estimate an underlying normal progression for each given sequence, represented as occurrence probabilities of events along the sequence progression. We also introduce a visualization system, ET3, to facilitate the interpretation of anomalies within the context of normal sequence progressions in the dataset through comprehensive one-to-many sequence comparison.
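As a rough illustration of what "occurrence probabilities of events along the sequence progression" can look like, the sketch below stores each sequence as a list of timestamped events and computes empirical event-type frequencies per progression bin. It is not the thesis's VAE model: the `Event` class, the `occurrence_profile` function, the equal-width position bins, and the sample data are all hypothetical and chosen only to make the data representation concrete.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # hypothetical unit, e.g. days since the first record
    event_type: str   # e.g. "diagnosis", "lab_test", "treatment"


def occurrence_profile(sequences, n_bins=2):
    """Empirical P(event type | progression bin) over a set of sequences.

    Each sequence is ordered by timestamp and split into n_bins bins by
    event position, a crude stand-in for learned progression stages.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        ordered = sorted(seq, key=lambda e: e.timestamp)
        for i, event in enumerate(ordered):
            bin_index = min(i * n_bins // len(ordered), n_bins - 1)
            counts[bin_index][event.event_type] += 1
    profile = {}
    for bin_index, counter in counts.items():
        total = sum(counter.values())
        profile[bin_index] = {t: c / total for t, c in counter.items()}
    return profile


# Made-up sample data: two short patient records.
patients = [
    [Event(0, "diagnosis"), Event(3, "lab_test"),
     Event(5, "treatment"), Event(9, "lab_test")],
    [Event(0, "diagnosis"), Event(2, "lab_test"),
     Event(4, "treatment"), Event(8, "treatment")],
]
profile = occurrence_profile(patients, n_bins=2)
print(profile)
```

A sequence whose events are unlikely under such a profile (for example, a treatment in the first bin here) would be a candidate for closer inspection. A learned model such as the VAE described above would replace these raw frequencies with per-sequence estimates.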
Keywords/Search Tags:Visual Analytics, Visualization, Event Sequence Data