
Optimization for Regression, PCA, and SVM: Optimality and Scalability

Posted on: 2016-06-07 | Degree: Ph.D | Type: Thesis
University: Northwestern University | Candidate: Park, Young Woong | Full Text: PDF
GTID: 2478390017971473 | Subject: Operations Research
Abstract/Summary:
As profit margins are squeezed, even a slight improvement in solution quality is critical in many industries and academic fields. In addition, growing data sizes make the capability to solve large-scale problems important in many fields. However, many optimization problems in machine learning are currently solved non-optimally for lack of better algorithms. This thesis contributes to resolving this issue by focusing on the design of scalable, globally optimal algorithms for hard-to-solve or large-scale machine learning problems.

While the ultimate goal is to provide optimal solutions by means of scalable algorithms, the focus differs across the three chapters of this thesis depending on the problem type. Chapters 1 and 2 focus on providing optimal or improved solutions for hard-to-solve problems, whereas Chapter 3 focuses on improving computational efficiency so that large-scale problems can be solved. The problems studied in this thesis are important parts of a statistical learning process. Feature selection is crucial in constructing statistical learning models and is performed by principal component analysis (PCA) and regression subset selection, the subjects of Chapters 1 and 2. Further, given the availability of large-scale data, the need for tractable machine learning algorithms, such as the algorithm in Chapter 3, is increasing.

Chapter 1 presents mixed integer programs (MIPs) that find the best subset of variables for multiple linear regression; the formulations are the first mathematical programming models to directly optimize the mean absolute error and mean squared error. Computational experiments show that the proposed models and algorithms significantly improve solution quality (by up to 30% over stepwise heuristics on the selected data sets). The MIP models can also easily incorporate logical constraints, which stepwise heuristics and most other algorithms cannot (a sketch of such a formulation follows the abstract).

Chapter 2 studies optimization of the absolute error for PCA and develops convergent algorithms based on iteratively reweighted least squares, singular value decomposition, and mathematical programming (an illustrative reweighting loop also follows the abstract). One of the convergence results is generalized to show that the eigenvalues of a convergent sequence of symmetric matrices are themselves convergent. Computational experiments show that both algorithms outperform the benchmark algorithms in the literature in the presence of significant outliers (up to a 15% improvement on the selected data sets).

Chapter 3 develops a general algorithmic framework, called Aggregate and Iterative Disaggregate (AID), for machine learning problems such as least absolute deviation regression and support vector machines (SVMs). The algorithm is designed for large-scale problems and is based on clustering and data aggregation; its basic loop is sketched after the abstract. The algorithm is shown to be monotonically convergent for some of the selected problems and to reduce computational time substantially as data size increases (up to 9 times faster than state-of-the-art packages and software on the selected data sets).
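The following is a minimal sketch of the kind of MIP Chapter 1 describes, written with the PuLP modeling library. The big-M linking constraints, the cardinality bound k, and all variable names are illustrative assumptions rather than the thesis's exact formulation, and only the MAE objective is shown, since the MSE variant would require a quadratic solver.

```python
# Sketch: best-subset linear regression minimizing mean absolute error
# (MAE) as a mixed integer program. Illustrative only; the big-M linking
# and the cardinality bound k are assumptions, not the thesis's model.
import numpy as np
import pulp

def best_subset_mae(X, y, k, M=100.0):
    n, p = X.shape
    prob = pulp.LpProblem("best_subset_mae", pulp.LpMinimize)
    beta = [pulp.LpVariable(f"beta_{j}", -M, M) for j in range(p)]
    z = [pulp.LpVariable(f"z_{j}", cat="Binary") for j in range(p)]  # 1 if feature j is used
    e = [pulp.LpVariable(f"e_{i}", lowBound=0) for i in range(n)]    # |residual_i|
    prob += pulp.lpSum(e) / n                                        # objective: MAE
    for i in range(n):
        pred = pulp.lpSum(X[i, j] * beta[j] for j in range(p))
        prob += e[i] >= y[i] - pred        # linearize the absolute residual
        prob += e[i] >= pred - y[i]
    for j in range(p):                     # beta_j nonzero only if z_j = 1
        prob += beta[j] <= M * z[j]
        prob += beta[j] >= -M * z[j]
    prob += pulp.lpSum(z) <= k             # select at most k features
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return np.array([b.value() for b in beta])
```

A logical constraint of the kind the abstract mentions (for example, "feature 2 may enter only if feature 1 does") would simply be another linear inequality on the binaries, such as z[2] <= z[1].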
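Below is a rough sketch, in the spirit of Chapter 2, of an iteratively reweighted least squares loop for outlier-robust PCA. This simplified variant minimizes the sum of per-row residual norms; the weight update and the fixed iteration count are generic IRLS conventions, not the thesis's algorithms.

```python
# Sketch: IRLS-style robust PCA that down-weights outlying rows, in the
# spirit of Chapter 2's absolute-error PCA. Each iteration solves a
# weighted least-squares subproblem via SVD, then reweights rows so that
# large residuals are penalized by their norm rather than its square.
import numpy as np

def irls_robust_pca(X, r, n_iter=50, eps=1e-8):
    n, p = X.shape
    w = np.ones(n)                        # per-row weights, start uniform
    V = None
    for _ in range(n_iter):
        Xw = X * np.sqrt(w)[:, None]      # reweight rows
        _, _, Vt = np.linalg.svd(Xw, full_matrices=False)
        V = Vt[:r].T                      # p x r basis of the subspace
        resid = X - X @ V @ V.T           # residual of each row
        norms = np.linalg.norm(resid, axis=1)
        w = 1.0 / np.maximum(norms, eps)  # IRLS weight update
    return V
```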
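Finally, a high-level sketch of the AID loop of Chapter 3 applied to least absolute deviation (LAD) regression. The k-means aggregation, the weighted-centroid representatives, and the rule for splitting clusters with large residual spread are illustrative assumptions; the thesis's actual declustering test and monotone convergence guarantees are not reproduced here.

```python
# Sketch: an Aggregate and Iterative Disaggregate (AID)-style loop for
# LAD regression. Aggregate observations into clusters, solve a small
# weighted problem on the representatives, then split clusters whose
# members disagree about the fit, and repeat.
import numpy as np
from sklearn.cluster import KMeans

def lad_fit(X, y, w):
    # Weighted LAD via IRLS over weighted least squares (a simple
    # stand-in solver; any LAD/LP solver could be used here).
    beta = np.zeros(X.shape[1])
    for _ in range(30):
        r = np.abs(y - X @ beta)
        sw = np.sqrt(w / np.maximum(r, 1e-8))
        beta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
    return beta

def aid_lad(X, y, k0=20, rounds=5):
    labels = KMeans(n_clusters=k0, n_init=10).fit_predict(np.column_stack([X, y]))
    for _ in range(rounds):
        ids = np.unique(labels)
        # aggregate: one weighted representative (centroid) per cluster
        Xa = np.array([X[labels == c].mean(axis=0) for c in ids])
        ya = np.array([y[labels == c].mean() for c in ids])
        wa = np.array([(labels == c).sum() for c in ids], dtype=float)
        beta = lad_fit(Xa, ya, wa)       # solve the small aggregated problem
        # disaggregate: split clusters whose members disagree about the fit
        resid = y - X @ beta
        next_label = labels.max() + 1
        for c in ids:
            m = labels == c
            if m.sum() > 1 and resid[m].std() > resid.std():
                half = np.where(m)[0][: m.sum() // 2]
                labels[half] = next_label
                next_label += 1
    return beta
```

The point of the design is that each round solves a problem with only as many observations as there are clusters, so the expensive solver touches a far smaller data set than the original; disaggregation then refines only where the aggregated solution is unreliable.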
Keywords/Search Tags: PCA, Selected data, Regression, Algorithms, Machine learning, Optimization, Optimal