Sparse and large-scale learning models and algorithms for mining heterogeneous big data

Posted on:2014-09-21

Degree:Ph.D

Type:Dissertation

University:The University of Texas at Arlington

Candidate:Cai, Xiao

Full Text:PDF

GTID:1458390008456481

Subject:Engineering

Abstract/Summary:

With the spread of PCs, the internet, and mobile devices, we are entering an era of exploding data. On one hand, more and more features can be collected to describe each data point, so the dimensionality of the data descriptor keeps growing. On the other hand, the number of data points itself is exploding, and the data can be collected from multiple sources. When data become large scale, traditional data analysis methods may fail, suffering from problems such as the curse of dimensionality. To explore and analyze large-scale data more accurately and efficiently, we propose several learning algorithms, tailored to the characteristics of the data, for mining heterogeneous data. Specifically, if the feature dimension is large, we propose several sparse-learning-based feature selection methods to select key words from text or to find biomarkers in gene expression data; if the number of data points is huge, we propose a multi-view K-Means method for clustering that avoids the heavy burden of graph construction; if the data are represented by or collected from multiple sources, we propose a graph-based multi-modality model for semi-supervised learning and clustering. In addition, if the number of classes is large, we provide a global solution to low-rank regression and prove that low-rank regression is equivalent to performing linear regression in the LDA space. We empirically evaluate each proposed model on several benchmark data sets, and our methods consistently achieve better results than state-of-the-art methods.
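
The abstract does not say which sparse formulation the dissertation uses for feature selection. Purely as a generic illustration of the idea, and not as the author's method, the sketch below assumes an L1-regularized least-squares objective (the lasso) solved by cyclic coordinate descent; features whose learned weight is exactly zero are discarded. The function names and the choice of solver are assumptions made for this example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize 0.5 * ||Xw - y||^2 + lam * ||w||_1 by cyclic coordinate descent.

    X is a list of n rows, each a list of d feature values; y is a list of n targets.
    This is an assumed, generic sparse model, not the dissertation's formulation.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(d)]
    for _ in range(n_sweeps):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature carries no information
            # Correlation of feature j with the residual computed without feature j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the residual consistent with the updated weight.
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_wj
    return w


def selected_features(w):
    """Indices of features with a nonzero weight, i.e. the features kept."""
    return [j for j, wj in enumerate(w) if wj != 0.0]
```

A larger `lam` drives more weights to exactly zero and so keeps fewer features; in practice it would be tuned, for example by cross-validation.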

Keywords/Search Tags:

Computer

Related items

1	Bacteria - Phage Biological Computer
2	A study of the contributions of attitude, computer security policy awareness, and computer self-efficacy to the employees' computer abuse intention in business environments
3	Object Detection And Human-computer Interaction Based On UAV Platform
4	A study of the impact of computer applications supplementation in college computer literacy courses
5	Psychosocial influences of computer anxiety, computer confidence, and computer self-efficacy with online health information in older adults
6	Research On Computer Forensic For Storage Media
7	Research Of Intrusion Forensic System Based On Log Analysis Under Linux
8	Development Of Computer Controlled DC Speed Regulating Training System
9	Enterprise Computer System Safety Improvement Progress
10	Research And Application Of Computer Antagonism In Information Warfare