Distributed learning: Regression on attribute-distributed data and consensus clustering

Posted on: 2012-10-04
Degree: Ph.D.
Type: Dissertation
University: Princeton University
Candidate: Zheng, Haipeng
Full Text: PDF
GTID: 1458390011451450
Subject: Engineering
Abstract/Summary:
This dissertation compiles four studies united by their relevance to attribute-distributed learning, both supervised (regression) and unsupervised (clustering). Regression on attribute-distributed data is discussed first: the theoretical performance limits of a linear ensemble estimator are investigated, and an iterative training protocol with low test error and high robustness to irrelevant agents is proposed. To quantify the trade-off between communication and performance in regression on attribute-distributed data, an iterative training algorithm based on inaccurate estimates of the covariance matrix of individual training residuals is designed and tested under different amounts of data exchange. To reduce data exchange and the negative influence of irrelevant agents, a heuristic-based intelligent agent selection algorithm is proposed and tested. Finally, motivated partly by attribute-distributed clustering problems, the Filtered Stochastic Best-Multiple-Element-Move (BMEM) algorithm is designed and investigated; it offers superior computational efficiency and better final results than other local search algorithms for consensus clustering.
Keywords/Search Tags: Attribute-distributed, Clustering, Regression, Algorithm