Data mining via support vector machines: Scalability, applicability, and interpretability

Posted on:2005-07-10

Degree:Ph.D

Type:Thesis

University:University of Illinois at Urbana-Champaign

Candidate:Yu, Hwan-Jo


GTID:2458390008994708

Subject:Computer Science

Abstract/Summary:

KDD (Knowledge Discovery and Data Mining) has been studied extensively over the last decade as data has continued to grow in size and complexity. This thesis introduces three practical data mining problems: (1) classifying large data sets, (2) classifying without negative data (i.e., single-class classification), and (3) discovering discriminant feature combinations. It presents solutions based on a principled methodology, Support Vector Machines (SVMs), to produce higher-quality results with less human intervention. We first address several challenges in applying SVM technology to the practice of data mining: (1) scalability: SVMs do not scale well with data size, while common data mining applications often involve millions or billions of data objects; (2) applicability: SVMs are limited to (semi-)supervised learning, which is mostly applied to binary classification problems; and (3) interpretability: it is hard to interpret and extract knowledge from SVM models. We then propose three principled solutions that address these challenges, one for each of the problems of large-scale classification, single-class classification, and discriminant feature combination discovery. The contributions of this thesis cover applications in bioinformatics and text and Web mining, as well as methodologies of data mining and machine learning.
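To make the scalability and interpretability challenges more concrete, the following is a minimal sketch of a generic linear SVM trained by batch subgradient descent on the regularized hinge loss. It is not the method proposed in the thesis; the function names, the toy data, and the hyperparameters (`lam`, `epochs`, `lr`) are illustrative assumptions. Each epoch makes a full pass over the data, which hints at why training cost grows with data size, and in the linear case the learned weights give a rough indication of which features drive the decision.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Batch subgradient descent on the L2-regularized hinge loss.

    Illustrative sketch only: names and hyperparameters are assumptions.
    X is a list of feature lists, y a list of labels in {-1, +1}.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        # Gradient of the regularizer (lam / 2) * ||w||^2.
        gw = [lam * wj for wj in w]
        gb = 0.0
        # One full pass over the data per epoch.
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # point violates the margin: hinge loss is active
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[2.0, 0.1], [3.0, -0.2], [2.5, 0.3],
     [-2.0, 0.2], [-3.0, -0.1], [-2.5, -0.3]]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

On this toy data the weight on the informative feature ends up much larger in magnitude than the weight on the noise feature. With nonlinear kernels no such per-feature weight vector is directly available, which is one reason SVM models are generally hard to interpret.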

Keywords/Search Tags:

Data mining

Related items

1	Applications Of Data Mining For The Competitive Intelligence System In The Enterprise
2	Based On Data Mining, Web Mining System
3	Study On Several Typical Data Mining Methods And Their Applications
4	Research On Technologies And Application Of Data Mining For PLM
5	Web-based Data Mining Technology
6	Research On The Technology Of Web Log Mining
7	Web-Based Data Mining Technology Research And Application
8	Research And Application Of Algorithm In Data Mining Based On Oracle Data Mining API
9	Data Mining Applications, Decision Support Systems In Auto Sales
10	Multi-Users Online Visual Data Mining System