
High-dimensional Data Analysis Based On Scatter Plot Classification

Posted on: 2020-05-22  Degree: Master  Type: Thesis
Country: China  Candidate: X H Liu  Full Text: PDF
GTID: 2518306518966949  Subject: Software engineering
Abstract/Summary:
With the widespread use of networked devices such as computers, the amount of data has grown rapidly, and the analysis of large data volumes has become an important part of data analysis. High-dimensional data analysis is, in turn, an important part of big data analysis. Most high-dimensional data analysis systems rely on data filtering and dimension reduction. Their views generally present information quantitatively, for example as scatter plot matrices or parallel coordinates, and these methods still place a relatively large cognitive burden on users. For high-dimensional data analysis, this thesis proposes a method that analyzes dimensions through scatter plot patterns and combines this with data filtering. First, we found that scatter plots exhibit distinct topologies. Based on this finding, we trained several commonly used classification models on scatter plots and selected the best-performing one, MLPClassifier, as the classification model. The trained model is then used to classify and label the scatter plots of a new data set, and this label information is used to analyze the dimensions. Finally, we combine the dimension analysis with data filtering to analyze high-dimensional data. We have designed a visualization system that consists of a dimension scatter view, a dimension hierarchical cluster view, a data domain view, and a scatter plot matrix in the form of a heat map. The dimension scatter view intuitively displays the distances between dimensions, their grouping, and related information. The dimension hierarchical cluster view gives users a reference for clustering. The data domain view shows the topology of the data and how it changes. The scatter plot matrix is used to verify that the dimension grouping is reasonable. Interactive operations such as dragging and box selection are also provided for filtering data. Finally, the thesis uses three use cases to verify the soundness and practicality of the method and system, and shows how they help users perform cluster analysis and understand high-dimensional data.
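The abstract does not say how a scatter plot is turned into classifier input, so the following is only a minimal sketch under one assumption: each pair of dimensions is rasterized into a small, normalized occupancy grid, and the flattened grid serves as the feature vector. The function names `rasterize` and `pairwise_features`, the `bins` parameter, and the toy data are all hypothetical and are not taken from the thesis.

```python
from itertools import combinations


def rasterize(xs, ys, bins=8):
    """Turn one scatter plot (two dimensions) into a flat bins*bins grid.

    Assumption for illustration: the grid holds the fraction of points
    that fall into each cell. The thesis may use a different encoding.
    """
    def to_bins(values):
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # constant dimension: avoid dividing by zero
        # Map each value to a bin index in [0, bins - 1].
        return [min(int((v - lo) / span * bins), bins - 1) for v in values]

    grid = [0.0] * (bins * bins)
    for col, row in zip(to_bins(xs), to_bins(ys)):
        grid[row * bins + col] += 1.0
    total = len(xs)
    return [count / total for count in grid]


def pairwise_features(columns, bins=8):
    """Build one feature vector per pair of dimensions,
    i.e. one per cell of a scatter plot matrix."""
    return {
        (a, b): rasterize(columns[a], columns[b], bins)
        for a, b in combinations(sorted(columns), 2)
    }


# Hypothetical toy data set with three dimensions.
data = {
    "x": [0.0, 1.0, 2.0, 3.0],
    "y": [0.0, 1.0, 2.0, 3.0],  # linear in x
    "z": [5.0, 5.0, 5.0, 5.0],  # constant
}
features = pairwise_features(data, bins=4)
```

Feature vectors of this kind could be passed, together with pattern labels, to a classifier such as scikit-learn's `MLPClassifier` (`fit` on labeled plots, `predict` on the plots of a new data set). The predicted labels per dimension pair would then feed the dimension analysis described above.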
Keywords/Search Tags:High Dimensional Data, Visualization System, Dimension Reduction, Subspace