Research And Implementation Of XML Document Classification Based On Extreme Learning Machine In Cloud Environment

Posted on: 2014-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Ding
Full Text: PDF
GTID: 2308330473951048
Subject: Computer system architecture
Abstract/Summary:
XML document classification is significant in both academic research and practical applications. How to classify XML documents using both semantic and structural information while keeping learning fast has become an active research topic. In addition, with the development of the Internet, cloud computing has become a leading approach to massive data processing because of its parallel computing power.

Extreme Learning Machine (ELM) has shown good generalization performance and extremely fast learning speed in both regression and classification applications. Recently, it has been shown that ELM and Support Vector Machine (SVM) are consistent from the optimization point of view, and that ELM can use a wide range of feature mappings to unify algorithms based on both ELM and SVM. However, as the volume of training data in massive learning applications grows exponentially, centralized ELM with kernels suffers from the heavy memory consumption of large matrix operations.

This paper proposes a novel distributed solution for massive XML classification. It consists of a distributed algorithm for converting the XML representation model, MR-SLVM, and a distributed kernelized ELM algorithm, DK-ELM, both of which are efficiently implemented on MapReduce. To ensure the parallelism of DK-ELM, Stochastic Singular Value Decomposition (SSVD) is applied to realize distributed matrix inversion. Furthermore, two sub-algorithms are proposed: Distributed Radial Basis Function (D-RBF) and Distributed Matrix-Vector Multiplication (DMXV).

Finally, extensive experiments on massive real-world datasets are conducted on a cluster to verify the scalability and training performance of MR-SLVM and DK-ELM. The experimental results show that DK-ELM is efficient and scales well for massive learning applications.
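
To make the computation behind kernelized ELM concrete, the following is a minimal single-machine sketch in Python with NumPy. It is not the thesis's DK-ELM implementation and does not use MapReduce. It assumes the standard kernel ELM solution, output weights = (I/C + Omega)^-1 T with Omega the RBF kernel matrix, and uses an exact SVD in place of SSVD. All function names, the one-hot target encoding, the parameters `C` and `gamma`, and the row-block split standing in for DMXV map tasks are illustrative assumptions.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix: K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clamp tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def invert_via_svd(M):
    """Invert a well-conditioned square matrix from its SVD, M = U diag(s) Vt.

    Assumption: an exact SVD stands in for the stochastic SVD (SSVD)
    named in the abstract.
    """
    U, s, Vt = np.linalg.svd(M)
    return (Vt.T / s) @ U.T  # V diag(1/s) U^T


def blockwise_matvec(M, v, n_blocks):
    """Compute M @ v one row block at a time.

    Each row block plays the role of one independent map task; the
    concatenation plays the role of the reduce step (illustrative only).
    """
    parts = [block @ v for block in np.array_split(M, n_blocks, axis=0)]
    return np.concatenate(parts)


def train_kernel_elm(X, y, n_classes, C=10.0, gamma=1.0, n_blocks=2):
    """Return output weights beta = (I / C + Omega)^-1 T."""
    T = np.eye(n_classes)[y]  # one-hot targets (an assumed encoding)
    omega = rbf_kernel(X, X, gamma)
    inverse = invert_via_svd(np.eye(len(X)) / C + omega)
    # Apply the inverse to each target column with the block-wise product.
    columns = [blockwise_matvec(inverse, T[:, k], n_blocks) for k in range(n_classes)]
    return np.column_stack(columns)


def predict_kernel_elm(X_train, beta, X_new, gamma=1.0):
    """Predict the class with the largest kernel ELM output."""
    scores = rbf_kernel(X_new, X_train, gamma) @ beta
    return np.argmax(scores, axis=1)
```

In this sketch the n-by-n kernel matrix is built and inverted in memory, which is the cost the abstract identifies as the bottleneck of centralized kernel ELM. The row-block product only indicates where the work could be partitioned across workers.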
Keywords/Search Tags: cloud computing, massive data, XML classification, extreme learning machine