
Research On Dimensionality Reduction Visualization Method Of High Dimensional Biological Data Based On Gradient Descent And Adaptive Learning

Posted on: 2019-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: X C Zhu
GTID: 2428330548472428
Subject: Computer system architecture
Abstract/Summary:
High-dimensional data visualization is a challenging problem in data mining. Traditional methods map data points from a high-dimensional space to a low-dimensional space through dimensionality reduction, but they are limited by the properties of metric spaces. This makes it difficult to faithfully represent similarity data in non-metric spaces, because such similarities are often non-transitive and objects can share different latent similarities. For example, from the perspective of disease phenotype, many complex diseases are particular combinations of multiple symptoms. These disease-related symptoms often reveal common disease and physiological mechanisms, but the relationships between disease phenotypes are clearly non-transitive, so the overlap of disease phenotypes makes certain types of complex diseases difficult to diagnose accurately. Efficient data visualization methods can help discover the latent patterns and distribution characteristics of different dimensions in complex high-dimensional data, and help researchers eliminate the effect of ambiguity in high-dimensional data and make accurate diagnoses and predictions.

The main research work of this thesis is as follows:

1) A regularized mm-tSNE high-dimensional data visualization method based on Nesterov momentum. mm-tSNE maps data points from a high-dimensional space to a low-dimensional space for visualization. When the objective function is optimized with momentum, the direction of each parameter update depends not only on the current gradient but also on the direction of the previous update. With standard momentum, the update direction is not necessarily accurate, and an overly large step size causes the algorithm to oscillate excessively. The Nesterov-momentum-based mm-tSNE regularization method first applies the momentum step to reach a temporary parameter position and then computes the gradient at that temporary position, which provides a larger and more timely gradient correction. This allows the Nesterov momentum method to respond to gradient changes more quickly and to converge faster and more stably than the standard momentum method. The experimental results show that, compared with the original mm-tSNE and mm-tSNE regularization algorithms, the Nesterov-momentum-based mm-tSNE regularization method obtains better visualization results and better represents non-transitive similarity between data.

2) An adaptive-learning-based mm-tSNE regularization method for visualizing high-dimensional biological data. When the target loss function is optimized, the learning rate is one of the hardest hyperparameters to set, and it has a significant impact on the visualization performance of the model. Traditionally, a single fixed learning rate is applied to the gradient in every parameter iteration. A single learning rate does not suit all gradient search directions, because the loss function is usually more sensitive to some directions in parameter space than to others. A high learning rate works well along directions of low curvature but can diverge along directions of high curvature. This thesis presents an adaptive-learning mm-tSNE regularization visualization method that uses RMSProp to set the learning rate adaptively according to curvature information, normalizing the gradient of each parameter by an exponential moving average of its squared gradients so as to provide a better and more timely gradient correction. The experimental results show that the adaptive-learning-based mm-tSNE regularization method converges faster than the original algorithm and reduces the error of the target loss function, and that non-metric properties are ubiquitous in microbial, biological, and similar datasets.

Learning the objective loss function with the new optimization methods better relaxes the constraints imposed by the properties of metric spaces, so the non-transitivity of similarity data can be expressed better and faster.
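The two update rules named in the abstract can be sketched as follows. This is only a minimal illustration, not the thesis's implementation: the mm-tSNE objective is replaced by a hypothetical two-dimensional quadratic loss with one low-curvature and one high-curvature direction, and all names and hyperparameter values (`CURV`, `nesterov_step`, `rmsprop_step`, `run`, the learning rates) are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical stand-in for the mm-tSNE loss: f(x) = 0.5 * sum(CURV * x**2),
# with one low-curvature and one high-curvature direction.
CURV = np.array([1.0, 100.0])


def loss(x):
    return 0.5 * float(np.sum(CURV * x ** 2))


def grad(x):
    return CURV * x


def nesterov_step(x, v, lr=0.005, mu=0.9):
    """Nesterov momentum: evaluate the gradient at the look-ahead position."""
    lookahead = x + mu * v                  # temporary parameter position
    v_new = mu * v - lr * grad(lookahead)   # gradient correction taken there
    return x + v_new, v_new


def rmsprop_step(x, s, lr=0.02, decay=0.9, eps=1e-8):
    """RMSProp: scale each coordinate by a moving average of squared gradients."""
    g = grad(x)
    s_new = decay * s + (1.0 - decay) * g ** 2
    return x - lr * g / (np.sqrt(s_new) + eps), s_new


def run(step, n_iter=300):
    """Apply one of the step functions repeatedly from a fixed start point."""
    x = np.array([1.0, 1.0])
    state = np.zeros_like(x)  # velocity (Nesterov) or squared-gradient average (RMSProp)
    for _ in range(n_iter):
        x, state = step(x, state)
    return x


if __name__ == "__main__":
    print("initial loss :", loss(np.array([1.0, 1.0])))
    print("Nesterov loss:", loss(run(nesterov_step)))
    print("RMSProp loss :", loss(run(rmsprop_step)))
```

In this toy setting the RMSProp normalization makes the step size nearly independent of the curvature of each direction, which is the property the abstract relies on when it argues that a single fixed learning rate does not suit every search direction.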
Keywords/Search Tags: Nesterov momentum, mm-tSNE regularization, adaptive learning, RMSProp, high-dimensional data visualization