
Intractability results for problems in computational learning and approximation

Posted on: 2010-05-24
Degree: Ph.D.
Type: Thesis
University: Georgia Institute of Technology
Candidate: Rishi Saket
Full Text: PDF
GTID: 2448390002976670
Subject: Mathematics
Abstract/Summary:
In this thesis we prove intractability results for some well-studied problems in computational learning and combinatorial optimization.

Minimizing and learning DNF expressions. We study the problem of finding the minimum size DNF formula for a function f : {0,1}^d → {0,1} given its truth table. We show that unless NP ⊆ DTIME(n^{polylog n}), there is no polynomial time algorithm that approximates this problem to within a factor of d^{1−ε}, where ε > 0 is an arbitrarily small constant. Our result essentially matches the known O(d) approximation for the problem.

We also study the learnability of small size DNF formulas. We show that assuming NP ⊈ RP, for an arbitrarily small constant ε > 0 and any fixed positive integer t, a two term DNF cannot be PAC-learnt in polynomial time by a t term DNF to within accuracy ½ + ε. Under the same complexity assumption, we show that for arbitrarily small constants μ, ε > 0 and any fixed positive integer t, an AND function (i.e. a single term DNF) cannot be PAC-learnt in polynomial time under adversarial μ-noise (also known as agnostic learning) by a t-CNF to within accuracy ½ + ε. Our results improve upon the previously known hardness for these problems [1, 30, 31].

Learning intersections of two halfspaces. We show that unless NP = RP, it is hard to PAC-learn the intersection of two halfspaces in R^n using a hypothesis which is a function of up to ℓ linear threshold functions, for any integer ℓ, to within accuracy ½ + ε for any constant ε > 0. Specifically, we show that for every integer ℓ and arbitrarily small constant ε > 0, unless NP = RP, no polynomial time algorithm can distinguish whether there is an intersection of two halfspaces that correctly classifies a given set of labeled points in R^n, or whether every function of ℓ linear threshold functions correctly classifies at most a ½ + ε fraction of the points.
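To make the truth-table DNF minimization problem concrete, here is a small illustrative Python sketch (not part of the thesis): it brute-forces the minimum number of AND terms in a DNF computing a given function on {0,1}^d. The function and variable names are hypothetical, and the search is exponential in d, which is exactly why the approximability of the problem matters.

```python
from itertools import product, combinations

def term_accepts(term, x):
    # term: dict mapping variable index -> required bit (an AND of literals)
    return all(x[i] == b for i, b in term.items())

def min_dnf_size(f, d, max_terms=4):
    """Brute-force the minimum number of terms in a DNF computing f on
    {0,1}^d. Exponential in d; only a toy illustration of the problem."""
    points = list(product([0, 1], repeat=d))
    ones = [x for x in points if f(x)]
    zeros = [x for x in points if not f(x)]
    # candidate terms: every AND of literals that rejects all 0-points of f
    terms = []
    for spec in product([None, 0, 1], repeat=d):
        term = {i: b for i, b in enumerate(spec) if b is not None}
        if all(not term_accepts(term, z) for z in zeros):
            terms.append(term)
    # smallest subset of candidate terms covering every 1-point
    for k in range(1, max_terms + 1):
        for cover in combinations(terms, k):
            if all(any(term_accepts(t, x) for t in cover) for x in ones):
                return k
    return None

print(min_dnf_size(lambda x: x[0] ^ x[1], 2))  # XOR on 2 bits needs 2 terms
```

The inner search is a set-cover instance (terms covering the 1-points of f), which is the standard view under which the O(d) approximation mentioned above operates.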
Our result is optimal up to an arbitrarily small constant ε > 0, and improves upon the previous NP-hardness for this problem [1].

Reconstructing multivariate polynomials over F_2. We study the polynomial reconstruction problem for low-degree multivariate polynomials over F_2. In this problem, we are given a set of points x_i ∈ F_2^n and target values ζ_i ∈ F_2 for each of these points, with the strong promise that there is a linear polynomial over F_2 that agrees with at least a 1 − ε fraction of the point-value pairs. Our goal is to find a degree d polynomial that has good agreement with the set of point-value pairs. We show that it is NP-hard to find a polynomial that agrees with more than a 1 − 2^{−d} + δ fraction of the pairs, for any constants ε, δ > 0 and positive integer d. This extends the previously known hardness of approximation (indeed, NP-completeness) for the case d = 1, which follows from a celebrated result of Håstad [38]. In the setting of computational learning, our result shows the hardness of agnostic learning of parities, where the learner is allowed a low-degree polynomial over F_2 as a hypothesis.

SDP integrality gaps with local ℓ1-embeddability. We construct integrality gap instances for SDP relaxations of the MAXIMUM CUT and SPARSEST CUT problems. If the triangle inequality constraints are added to the SDP, then the SDP vectors naturally define an n-point negative type metric, where n is the number of vertices in the problem instance. Our gap instances satisfy the stronger constraint that every sub-metric on t = O((log log log n)^{1/6}) points is isometrically embeddable into ℓ1.
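The objects in the reconstruction problem above are easy to make concrete. The following Python sketch (an illustration, not the thesis construction) represents a multilinear polynomial over F_2 as a list of monomials, generates noisy labels for a parity (a linear polynomial), and measures the agreement fraction that the hardness result talks about. The parity, noise rate, and sample size are arbitrary choices for the demo.

```python
from itertools import product
import random

def eval_poly(monomials, x):
    """Evaluate a multilinear polynomial over F_2 at point x.
    monomials: iterable of tuples of variable indices; () is the constant 1."""
    return sum(all(x[i] for i in m) for m in monomials) % 2

def agreement(monomials, pairs):
    """Fraction of (point, value) pairs the polynomial agrees with."""
    hits = sum(eval_poly(monomials, x) == z for x, z in pairs)
    return hits / len(pairs)

# Labels from the parity x0 + x2 over F_2^4, with a 10% label flip
# playing the role of the adversarial noise in agnostic learning.
random.seed(0)
parity = [(0,), (2,)]
pairs = []
for _ in range(200):
    x = [random.randint(0, 1) for _ in range(4)]
    z = eval_poly(parity, x)
    if random.random() < 0.1:
        z ^= 1
    pairs.append((x, z))
print(agreement(parity, pairs))  # close to 0.9
```

In this vocabulary, the hardness result says that even when some parity has agreement 1 − ε, finding any degree-d polynomial with agreement noticeably above 1 − 2^{−d} is NP-hard.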
The local ℓ1-embeddability constraints are implied when the basic SDP relaxation is augmented with t rounds of the Sherali–Adams LP relaxation [75].

For the MAXIMUM CUT problem, we obtain an optimal gap of 1/α_GW − ε, where α_GW is the Goemans–Williamson constant [33] and ε > 0 is an arbitrarily small constant. For the SPARSEST CUT problem, we obtain a gap of Ω((log log log n)^{1/13}). The latter result can be rephrased as a construction of an n-point negative type metric such that every t-point sub-metric is isometrically ℓ1-embeddable, but embedding the whole metric into ℓ1 incurs distortion Ω((log log log n)^{1/13}).

Integrality gap for UNIFORM SPARSEST CUT. Arora, Rao and Vazirani [7] showed that the standard semidefinite programming relaxation of the UNIFORM SPARSEST CUT problem with the triangle inequality constraints has an integrality gap of O(√(log n)). They conjectured that the gap is bounded from above by a constant. In this thesis, we disprove this conjecture (referred to as the ARV-Conjecture) by constructing an Ω(log log n) integrality gap instance. Khot and Vishnoi [54] had earlier disproved the non-uniform version of the ARV-Conjecture.