Research On Cloud Data Mining For Sentiment Classification Task

Posted on: 2013-08-14
Degree: Master
Type: Thesis
Country: China
Candidate: X J Xiang
Full Text: PDF
GTID: 2268330431962041
Subject: Computer technology
Abstract/Summary:
The growth of data has become a bottleneck for knowledge discovery. Cloud computing provides powerful storage and parallel processing capabilities, and distributed data mining is one of its typical applications. To address a specific data mining task, we design a cloud data mining framework for sentiment classification built on a workflow modeling tool, a workflow engine and a cloud computing platform. The framework helps users complete task customization, task resolution, task execution, monitoring and result representation.

The main contributions of our work can be summarized as follows:

First, we present an overview of cloud data mining and introduce in detail the cloud data mining platform Hadoop and Oozie, a workflow engine for Hadoop.

Second, we design a cloud data mining framework for the sentiment classification task. We describe its structure and present the functions of each component in detail. The main function of the framework is to help users complete task customization, task resolution, task execution, monitoring and result representation. The system consists of three components: a task customization tool, a workflow engine and a cloud computing platform. The task customization tool is responsible for task definition and description. The workflow engine is responsible for task scheduling and management. The cloud computing platform is responsible for executing the concrete sub-tasks.

Third, we work on three sub-tasks of sentiment classification: feature extraction, term weighting and classification. We then parallelize a series of sentiment classification algorithms so that they are suited to large-scale data. Based on these algorithms, users can customize a task by selecting algorithms according to the characteristics of the data or their requirements, which demonstrates the system's flexibility and scalability.

Fourth, we evaluate the precision and time cost of the algorithms through comprehensive experimental analysis. The experimental results show that the algorithms are effective and the system is scalable.
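
As a rough illustration of how the term weighting sub-task could be expressed in a map/reduce style, the following minimal Python sketch emulates a map step and a reduce step in a single process. It is not taken from the thesis: the choice of TF-IDF as the weighting scheme, the whitespace tokenizer, and the names `map_tokens`, `reduce_counts` and `tf_idf` are all assumptions made for illustration, and a real deployment would run the equivalent steps as Hadoop jobs.

```python
import math
from collections import Counter, defaultdict


def map_tokens(doc_id, text):
    """Map step: emit ((doc_id, term), 1) for every token in one document."""
    for term in text.lower().split():
        yield (doc_id, term), 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def tf_idf(docs):
    """Return {doc_id: {term: weight}} with weight = tf * log(N / df).

    TF-IDF is an assumed weighting scheme, chosen only for illustration.
    """
    emitted = [pair for doc_id, text in docs.items()
               for pair in map_tokens(doc_id, text)]
    tf = reduce_counts(emitted)                   # (doc_id, term) -> raw count
    df = Counter(term for (_doc_id, term) in tf)  # term -> number of documents
    n_docs = len(docs)
    weights = defaultdict(dict)
    for (doc_id, term), count in tf.items():
        weights[doc_id][term] = count * math.log(n_docs / df[term])
    return dict(weights)


# Hypothetical two-document corpus.
docs = {"d1": "good good movie", "d2": "bad movie"}
weights = tf_idf(docs)
```

In this toy corpus, a term that appears in every document (here "movie") receives weight zero, while terms specific to one document keep a positive weight, which is the property that makes such weights useful as input to a downstream classifier.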
Keywords/Search Tags: Cloud Data Mining, Sentiment Classification, Workflow