Design And Implementation Of Large Data Visualization System Based On Resolution Adaptive Sampling

Posted on:2021-01-10

Degree:Master

Type:Thesis

Country:China

Candidate:B Tang

Full Text:PDF

GTID:2428330647950860

Subject:Engineering

Abstract/Summary:

Data visualization is now widely used in artificial intelligence and data science. Because graphics are easy to inspect, researchers use them to find relationships in raw data that are otherwise hard to detect. In current AI data analysis, however, data is often both large in volume and high in dimension. Because screen resolution is limited, it is almost impossible to render high-dimensional big data completely on the screen, so visualization is difficult to implement on general-purpose, low-configuration hardware.

This thesis adapts to the resolution of the current screen: according to the scale of data that can be visualized at that resolution, it samples the large, high-dimensional original data and renders the sampling results on the screen. Based on this idea, a big data visualization system is designed. Using the visualization results produced by the system, users can observe the internal patterns and characteristics of the original data before training an artificial intelligence model, and so predict the usability of the training data.

The thesis designs and implements a big data visualization system and discusses several key technical issues. The main work includes:

(1) A sampling algorithm that preserves relative subspaces is used to reduce the large volume of original data to fit the current resolution. At the current resolution, the relative proportion of data in any subspace of the visualization is maintained, so that the data density in the space is consistent with the original data.

(2) A dimension reduction algorithm that sticks to the neighborhood extremum is used to reduce the high-dimensional data to a range of dimensions acceptable at the current resolution. At the current resolution, the neighborhood of any sampling point in the visualization stays attached to the extreme values of the graph boundary, so that the visualization shows the same trends of change as the original data.

(3) Through the preprocessing and rendering modules of the big data visualization system designed and developed in this thesis, resolution-adaptive sampling visualization of big data can be carried out effectively on general-purpose, low-configuration hardware.

For multiple datasets of different sizes (including the 2.5 TB Nanopore Reference Human Genome dataset), the renderings produced by the system, with radar charts as the display carrier, clearly show what data of the same category have in common and how different categories differ. Before training an artificial intelligence model, users can therefore use this system to predict the usability of the training data effectively, providing a basis for subsequent work to reach more accurate conclusions.
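The two reduction steps described in items (1) and (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract gives no algorithmic detail, so the function names, the uniform grid used as "subspaces", the point budget, and the bucket-wise min/max rule are all assumptions chosen for illustration. The first function keeps each grid cell's share of the points roughly equal to its share in the original data; the second keeps the minimum and maximum of each group of neighboring values so that peaks and troughs survive the reduction.

```python
import random
from collections import defaultdict


def density_preserving_sample(points, grid_w, grid_h, budget, seed=0):
    """Sample roughly `budget` 2-D points with coordinates in [0, 1).

    Hypothetical helper: the plane is split into grid_w x grid_h cells
    (standing in for "subspaces"), and each cell contributes points in
    proportion to its share of the original data.
    """
    total = len(points)
    if total <= budget:
        return list(points)

    # Group points by the grid cell they fall into.
    cells = defaultdict(list)
    for x, y in points:
        cx = min(int(x * grid_w), grid_w - 1)
        cy = min(int(y * grid_h), grid_h - 1)
        cells[(cx, cy)].append((x, y))

    rng = random.Random(seed)
    sample = []
    for members in cells.values():
        # Proportional quota; rounding means the total may differ
        # slightly from `budget`.
        quota = round(budget * len(members) / total)
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample


def minmax_downsample(values, buckets):
    """Reduce a sequence by keeping each bucket's minimum and maximum.

    Hypothetical helper: `values` could be the per-dimension values of
    one record; neighboring dimensions are grouped into `buckets`
    groups, and each group's extreme values are kept in original order.
    """
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    n = len(values)
    if n <= 2 * buckets:
        return list(values)

    reduced = []
    for b in range(buckets):
        chunk = values[b * n // buckets:(b + 1) * n // buckets]
        lo = min(range(len(chunk)), key=chunk.__getitem__)
        hi = max(range(len(chunk)), key=chunk.__getitem__)
        # A set avoids emitting the same element twice when min == max.
        for i in sorted({lo, hi}):
            reduced.append(chunk[i])
    return reduced
```

In such a sketch the grid size and the point budget would be derived from the screen resolution (for example, one cell per block of pixels), which is the sense in which the sampling is "resolution adaptive"; how the thesis derives them is not stated in the abstract.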

Keywords/Search Tags:

Resolution Adaptation, Sampling Algorithm, Dimension Reduction Algorithm, Big Data Visualization System

Related items

1	Research On The Dimension Reduction And Visualization Platform Of Brain Network State Observation Matrix Based On T-SNE Algorithm
2	Exploration Of Dimensionality Reduction For High-dimensional Data Visualization And Its Application In Biomedicine
3	Dimension reduction algorithms in data mining, with applications
4	Research On Methods Of Complex Simulation Data Dimension Reduction And Visualization Clustering
5	Research And Implementation Of High-Dimensional Data Visualization Based On Dimension Reduction Techniques
6	Research On Multi-dimensional Data Visualization Methods
7	The Study And Implementation Of High Dimensional Data Visualization Platform Based On Nonlinear Dimensionality Reduction Methods
8	The Research Of High-dimensional Data Dimension Reduction Technology Based On Shuffled Frog Leaping Algorithm
9	Dimension Reduction Under Complex Data Environment
10	Three-dimension Visualization Of Watershed Based On Multi-resolution LOD