
Distributed multifeature decision trees for classification

Posted on: 2011-06-30
Degree: Ph.D
Type: Dissertation
University: Oakland University
Candidate: Ouyang, Jie
Full Text: PDF
GTID: 1448390002467654
Subject: Engineering
Abstract/Summary:
Data mining techniques have advanced considerably over the past several decades, in both theory and practice. In recent years, as the volume of available data and the demand for knowledge discovery have grown together, the data mining community has faced new challenges. Two challenging topics have drawn extensive attention in the past decade. One is multi-label classification, which is used in applications such as web mining, multimedia retrieval, and bioinformatics. The other is knowledge discovery from geographically distributed data, a task that raises issues of privacy, communication cost, and the effectiveness of the data mining methods involved. This research addresses theoretical and practical issues in both areas. Our main contributions are: (1) an extension of two popular decision tree algorithms to handle data mining tasks in a distributed environment; (2) a new, efficient decision tree algorithm for multi-label classification; and (3) an exact extension of the new multi-label decision tree algorithm to handle distributed data.
Keywords/Search Tags: Decision tree, Data, Distributed