
Data Analysis Approaches By Combining Visualization And Data Mining

Posted on: 2018-07-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y X Ma
Full Text: PDF
GTID: 1368330548477398
Subject: Computer application technology
Abstract/Summary:
With the rapid growth of big data technologies, data analysis approaches have been deployed in academic and industrial applications. As two representative types of techniques, visualization and data mining have been in the limelight for the past twenty years. Both fields face challenges. Data mining relies on automated data processing algorithms that are non-intuitive and difficult to understand, explore, and optimize. Meanwhile, the increasing volume of datasets requires correspondingly fast data sampling, transformation, visual mapping, and layout in most typical visualization methods.

This thesis studies data analysis approaches that combine visualization and data mining techniques. The core challenge is how the methods from these two fields can assist each other, and how they can be combined in the knowledge extraction process. For data-mining-based methods, visualization can present mining results, strengthen the user's involvement in the entire mining process, and help users gain insights from data mining models. Conversely, many steps in visualization processes can be accelerated and enhanced dramatically with automated data mining techniques, such as visual mapping from data to visual spaces, or pattern recognition in visualization results.

In summary, the main contributions of this thesis are:

· A novel visualization mechanism for interactive exploration of community detection results. Our approach consists of two stages: interactive discovery of salient context, and iterative context-guided community detection. Central to the analysis process is a context relevance model (CRM) that visually characterizes the influence of a given set of contexts on the variation of the detected communities, and discloses the community structure in specific context configurations. The extracted relevance is used to drive an iterative visual reasoning process, in which the community structures are progressively discovered. We introduce a suite of visual representations to encode the community structures, the context, and the CRM. In particular, we propose an enhanced parallel coordinates representation to depict the context and community structures, which allows for interactive data exploration and community investigation.

· A visual analytical approach for transfer learning in classification. We present a suite of visual communication and interaction techniques to support the transfer learning process. Furthermore, a pioneering visual-assisted transfer learning methodology is proposed in the context of classification. Our solution includes a visual communication interface that allows for comprehensive exploration of the entire knowledge transfer process and the relevance among tasks. With these techniques and the methodology, analysts can intuitively choose relevant tasks and data, and iteratively incorporate their experience and expertise into the analysis process. We design a visualization system called TransExplorer to implement our approach.

· A visual analysis approach for open-box support vector machines. We design a novel visualization approach for building support vector machines (SVMs) in an open-box manner. Our goal is to improve an analyst's understanding of the SVM modeling process through a suite of visualization techniques that give users full interactive visual control over the entire SVM training process. Our visual exploration tools enable intuitive parameter tuning, training data manipulation, and rule extraction as part of the SVM training process. The entire scheme is encapsulated in our EasySVM system, which implements the analysis approaches.

· A deep subjective similarity metric for visual analysis of scatterplots. Our approach exploits deep neural networks to extract semantic features of scatterplot images for similarity calculation. We create a large labeled dataset consisting of similar and dissimilar images of scatterplots to train the deep neural network. We conduct a set of evaluations, including performance experiments and a user study, to demonstrate the effectiveness and efficiency of our approach. The evaluations confirm that the learned features capture the human perception of scatterplot similarity effectively. We describe two scenarios to show how the metric can be applied in visual analysis applications.
Keywords/Search Tags: Visualization, data mining, community detection, transfer learning, support vector machine, scatterplot, deep learning
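
The scatterplot similarity contribution compares plots through feature vectors. The sketch below illustrates only that general pattern (extract features per plot, then compare the vectors). It is not the thesis's method: the learned deep features are replaced by a hand-crafted grid histogram, and every name here (grid_histogram, cosine_similarity, scatterplot_similarity, the bins parameter) is hypothetical.

```python
import math


def grid_histogram(points, bins=8):
    """Stand-in feature extractor: a normalized bins x bins occupancy grid.

    Assumption: the thesis uses features learned by a deep network;
    this hand-crafted histogram only illustrates the pipeline shape.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    # Guard against zero extent when all points share a coordinate.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0
    hist = [0.0] * (bins * bins)
    for x, y in points:
        # Clamp so the maximum value falls into the last bin.
        i = min(int((x - x_min) / x_span * bins), bins - 1)
        j = min(int((y - y_min) / y_span * bins), bins - 1)
        hist[j * bins + i] += 1.0
    return [h / len(points) for h in hist]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def scatterplot_similarity(points_a, points_b, bins=8):
    """Similarity of two scatterplots, each a list of (x, y) pairs."""
    return cosine_similarity(grid_histogram(points_a, bins),
                             grid_histogram(points_b, bins))


if __name__ == "__main__":
    rising = [(i, i) for i in range(100)]
    rising_jittered = [(i, i + (i % 3)) for i in range(100)]
    falling = [(i, 99 - i) for i in range(100)]
    print(scatterplot_similarity(rising, rising_jittered))
    print(scatterplot_similarity(rising, falling))
```

In this toy setup, two rising trends should score higher than a rising and a falling trend. A learned metric, as the abstract describes, would instead be trained on labeled similar and dissimilar plot images so that the scores track human judgment.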