
Research On Sparse Modeling And Optimization Computation In Data Science

Posted on: 2019-05-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J J Yang
Full Text: PDF
GTID: 1368330578983012
Subject: Computational Mathematics
Abstract/Summary:
In the era of big data, technologies such as data science and machine learning are studied intensively. Optimization is essential to big-data analysis, modeling, and computation, and the design of efficient, high-accuracy algorithms is at the core of solving optimization problems. Many machine learning problems are essentially optimization problems, or can be transformed into them, which means optimization lies at the heart of machine learning: in the context of data modeling and analysis, optimization models are constructed on the basis of optimization theory. In practice, big-data models often contain redundant information, and one way to improve algorithmic efficiency is to discover and exploit particular structure in the problem at hand. Among such exploitable properties, sparsity is of special interest in data science: by introducing a sparse regularization term, sparse models and the corresponding sparse optimization methods are formulated and applied. This dissertation builds on these sparse optimization methods and covers the following topics: (1) the alternating direction method of multipliers (ADMM) under pair-wise linear constraints; (2) sparse optimization modeling in seismic inversion; (3) the theory of group partial regularization for sparse optimization; (4) a fuzzy maximum-margin smooth equalization clustering method.

ADMM is an efficient method for large-scale sparse optimization problems. Via the augmented Lagrangian function, it splits a high-dimensional problem into several low-dimensional subproblems and solves them iteratively. However, when there are more than three blocks of variables, or when the problem is non-convex, convergence remains an open problem. This dissertation first proposes an ADMM variant for optimization models under a special set of
constraint conditions, called pair-wise linear constraints. Subsequently, ADMM is extended to convex optimization problems with multiple separable blocks of variables. The dissertation gives a theoretical analysis proving the algorithms convergent, and experiments on both synthetic and real data sets provide practical support for the theory.

As interdisciplinary studies abound, optimization theory has become a powerful tool in engineering applications. For example, compressed sensing has become increasingly important in geophysics, with a number of successful applications to seismic and geophysical inverse problems. In seismic source inversion, the positions of energy radiation during seismic rupture are sparse, so a sparse optimization model is proposed accordingly. After a variable-separation step, this model satisfies the pair-wise linear constraints, and the resulting optimization model is readily solved by ADMM. A series of experiments shows that ADMM noticeably improves both the accuracy and the real-time performance of the inversion.

In many engineering applications, sparse solutions are required, but the basic l0-norm minimization problem is NP-hard. The l0-norm problem is known to be equivalent to the l1-norm problem under some strong conditions; however, approximating the l0-norm by the l1-norm normally introduces a bias, because the larger components of a vector contribute more to the l1-norm than to the l0-norm. A new group partial regularization model is proposed specifically for this sparse optimization problem, and the existence and sparsity of its solution are proved theoretically.

In data science, clustering models are widely used to explore data structure. Clustering analysis is an unsupervised learning method to unravel the
underlying structure of unlabeled data sets. On another front, frameworks based on maximum-margin theory have become a powerful tool for supervised learning, and extending the associated optimization theory and algorithms to unsupervised learning is a promising direction. One way to do this is to introduce a similarity measure on the data, construct the graph structure and its Laplacian matrix, and combine smooth regularization with equalization regularization, finally building a new fuzzy maximum-margin smooth equalization clustering model. An iterative optimization algorithm is proposed to solve this model. The dissertation further analyzes the consistent non-degeneracy of the clustering model's solution and theoretically proves the convergence of the iterative algorithm.
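The ADMM splitting described above (an augmented-Lagrangian decomposition into low-dimensional subproblems solved in turn) can be illustrated on a standard l1-regularized least-squares (lasso) problem. This is a generic textbook sketch, not the dissertation's pair-wise-constrained variant; all function and variable names are illustrative. The soft-thresholding step also makes concrete the l1-versus-l0 shrinkage bias discussed in the text.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau*||.||_1: shrinks each entry toward zero.

    This shrinkage is exactly the source of the l1/l0 bias: large
    components are reduced by tau, whereas the l0-norm would leave them
    untouched.
    """
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def admm_lasso(A, b, lam, rho=1.0, n_iter=200):
    """Solve min_x 0.5*||Ax - b||^2 + lam*||x||_1 via the splitting x = z.

    ADMM alternates an x-update (a linear solve), a z-update
    (soft-thresholding), and a dual ascent step on u.
    """
    m, n = A.shape
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)
    # Cache the matrix shared by every x-subproblem.
    AtA_rhoI = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(n_iter):
        x = np.linalg.solve(AtA_rhoI, Atb + rho * (z - u))  # x-subproblem
        z = soft_threshold(x + u, lam / rho)                # z-subproblem
        u = u + x - z                                       # dual update
    return z

# Tiny demo: recover a sparse vector from noisy linear measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[[2, 7]] = [3.0, -2.0]
b = A @ x_true + 0.01 * rng.standard_normal(50)
x_hat = admm_lasso(A, b, lam=0.5)
```

Each iteration touches only one block at a time: the x-update is a dense linear solve, while the z-update is a cheap componentwise operation, which is what makes the decomposition attractive for large-scale sparse problems.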
Keywords/Search Tags: sparse optimization, alternating direction method of multipliers, pair-wise linear constraints, inverse problems, group partial regularization, clustering analysis, maximum margin clustering