
Research And Application Of Web Front-end Multidimensional Data Visualization Technology

Posted on: 2020-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: M Wan
Full Text: PDF
GTID: 2428330596478630
Subject: Computer technology
Abstract/Summary:
The rapid growth of computer information technology has generated ever more data in social production activities. Within this growing volume, numerical multidimensional data stands out for its high complexity and large information content. Discovering the inherent patterns in such data, mining more valuable information from it, and providing effective guidance for subsequent research is an important research topic. Parallel coordinate visualization, one of the multidimensional data visualization techniques, can display the relationships among multidimensional data objects directly and comprehensively, and provides an important means of data analysis. However, when parallel coordinates display multidimensional data in a finite plane, the crowded space causes the lines to overlap heavily, which produces visual clutter and makes patterns in the data hard to find. Combining dimensionality reduction, clustering, and parallel coordinate visualization can mitigate this problem. The thesis combines these three techniques in its research and develops a visualization platform. The main work covers three aspects:

First, an improved t-SNE algorithm is proposed. The improved algorithm uses a weighted Euclidean distance to measure the similarity between sample points in the high-dimensional space, which yields more accurate similarities. An L2 regularization term is added to the loss function as a penalty to limit over-fitting of the objective function, and the optimized objective is obtained after multiple iterations. The improved algorithm and other dimensionality reduction algorithms were compared in simulations on the Wine dataset from the UCI (University of California, Irvine) database to verify the dimensionality reduction performance of the improved algorithm. The data set was then reduced with the improved t-SNE algorithm and displayed with parallel coordinates. Comparison with the visualization of the unreduced data shows that the proposed method improves the poor display quality of parallel coordinates.

Second, an improved K-means algorithm is proposed. The improved algorithm uses the Canopy algorithm to perform a "coarse" clustering that generates k Canopy points as the initial cluster centers of K-means. A weighted Euclidean distance is used to assign data points to clusters, which avoids the "distance distortion" problem of the traditional Euclidean distance. The improved algorithm and other clustering algorithms were compared in simulations on the Iris dataset from the UCI database to verify the clustering performance of the improved algorithm. The dimensionality-reduced data set was then processed with the improved K-means algorithm and displayed with parallel coordinates. Comparison with the visualization of the unclustered data shows that this method reveals the connections between data objects more intuitively.

Finally, building on the big data platform for tea quality and safety traceability and visualization in the Wulingshan area, a project the author took part in during postgraduate study, a multidimensional data visualization platform was developed using web front-end technology. Tests with the agricultural product production data set provided by the National Bureau of Statistics verify that the platform is suitable not only for low-dimensional data visualization but also for multidimensional data visualization. The improved dimensionality reduction algorithm and the improved clustering algorithm, combined with parallel coordinate visualization, improve the visualization result and make data analysis more convenient.
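The Canopy-seeded K-means step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting scheme, the Canopy thresholds, or any implementation details, so everything concrete here is an assumption made for illustration: the per-dimension weights `w`, the single tight threshold `t2`, the choice of the first remaining point as each Canopy center, and the sample points. The function names are hypothetical and this is not the thesis's implementation.

```python
import math


def weighted_euclidean(a, b, w):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (a_i - b_i)^2)."""
    return math.sqrt(sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)))


def canopy_centers(points, w, t2):
    """Coarse Canopy pass that returns one center per canopy.

    Simplification (assumption): only the tight threshold t2 is used,
    because only the centers are needed to seed K-means. Each center is
    the first remaining point, which keeps the sketch deterministic.
    """
    remaining = list(points)
    centers = []
    while remaining:
        c = remaining[0]
        centers.append(c)
        # Drop every point within t2 of the new center (including c itself).
        remaining = [p for p in remaining if weighted_euclidean(p, c, w) > t2]
    return centers


def kmeans(points, centers, w, iters=100):
    """Standard K-means loop using the weighted distance for assignment."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)),
                    key=lambda i: weighted_euclidean(p, centers[i], w))
            clusters[j].append(p)
        # Per-dimension mean; an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


# Hypothetical two-dimensional data with two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
w = (0.7, 0.3)  # assumed feature weights
seeds = canopy_centers(points, w, t2=2.0)
centers, clusters = kmeans(points, seeds, w)
```

In this sketch the number of clusters is not chosen in advance: it equals the number of Canopy centers found, which depends on `t2`.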
Keywords/Search Tags: Multi-dimensional data visualization, parallel coordinate visualization, t-SNE algorithm, K-means algorithm