Design And Implementation Of Large Data Visualization System Based On Resolution Adaptive Sampling

Posted on: 2021-01-10  Degree: Master  Type: Thesis
Country: China  Candidate: B Tang
GTID: 2428330647950860  Subject: Engineering
Abstract/Summary:
Data visualization is now widely used in artificial intelligence and data science. Because graphics are easy to inspect, researchers use them to find relationships in raw data that are otherwise hard to detect. In current AI data analysis, however, the data is often both very large and high-dimensional. Since screen resolution is limited, it is almost impossible to render high-dimensional big data completely on screen, and visualization is therefore difficult on general-purpose hardware with a low configuration.

This thesis adapts to the resolution of the current screen: according to the amount of data that can be visualized at that resolution, the large, high-dimensional original data is sampled and the sampling results are rendered on screen. A big data visualization system is designed on this basis. Before training an artificial intelligence model, users can examine the visualizations produced by the system to observe the internal patterns and characteristics of the original data, and so predict the usability of the training data.

The thesis designs and implements this big data visualization system and discusses several key technical issues. The main work includes:

(1) A sampling algorithm that preserves relative subspace proportions is used to reduce the large volume of original data to a range suited to the current resolution. At that resolution, the relative proportion of data in any subspace of the visualization is maintained, so that the data density across the space is consistent with the original data.

(2) A dimension reduction algorithm that adheres to neighborhood extrema is used to reduce the high-dimensional data to a range of dimensions that is acceptable at the current resolution. At that resolution, the neighborhood of any sampling point in the visualization is fitted to the extreme values of the graph boundary, so that the visualization shows the same trends as the original data.

(3) With the pre-processing and rendering modules of the system designed and developed in this thesis, resolution-adaptive sampling for big data visualization can be run effectively on general-purpose hardware with a low configuration. For multiple datasets of different sizes (including the 2.5 TB Nanopore Reference Human Genome dataset), the renderings produced by the system, which uses radar charts as the display medium, clearly show what samples of the same category have in common and how different categories differ. Before training an artificial intelligence model, users can therefore use this system to predict the usability of the training data and provide a basis for obtaining more accurate conclusions in subsequent work.
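The idea in item (1), keeping each subspace's share of the data while shrinking the dataset to what the screen can show, can be illustrated with a minimal sketch. This is not the thesis's algorithm, which the abstract does not specify. It assumes 2-D points in the unit square, a uniform grid as the "subspaces", and a point budget supplied by the caller (for example, derived from the pixel count). The function name `proportional_grid_sample` and all of its parameters are hypothetical.

```python
import random
from collections import defaultdict


def proportional_grid_sample(points, budget, cells_per_axis, seed=0):
    """Return roughly `budget` points, keeping each grid cell's share of the data.

    Assumptions (not from the thesis): points are (x, y) pairs in [0, 1] x [0, 1],
    and the subspaces are the cells of a uniform cells_per_axis x cells_per_axis grid.
    """
    points = list(points)
    total = len(points)
    if total <= budget:
        # Everything already fits at the current resolution.
        return points

    rng = random.Random(seed)

    # Group points by the grid cell they fall into.
    cells = defaultdict(list)
    for x, y in points:
        cx = min(int(x * cells_per_axis), cells_per_axis - 1)
        cy = min(int(y * cells_per_axis), cells_per_axis - 1)
        cells[(cx, cy)].append((x, y))

    # Give each cell a quota proportional to its share of the original data.
    # Rounding means the result can differ slightly from `budget`.
    sample = []
    for members in cells.values():
        quota = min(round(budget * len(members) / total), len(members))
        sample.extend(rng.sample(members, quota))
    return sample
```

Because quotas are proportional, a cell holding 80% of the original points also holds about 80% of the sampled points, which is the density-consistency property the abstract describes. Very sparse cells can round down to zero points in this sketch; a real system would need a policy for them.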
Keywords/Search Tags: Resolution Adaptation, Sampling Algorithm, Dimension Reduction Algorithm, Big Data Visualization System