Research On Ticket Price Forecast And Task Assignment In Cloud Computing

Posted on: 2015-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: C Z Huang
Full Text: PDF
GTID: 2268330431450062
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the emergence of Big Data, traditional data mining algorithms can neither meet efficiency requirements nor process such data directly. Cloud computing offers a new approach to data mining. Hadoop, Apache's open source framework for distributed computing, is widely used in large-scale data processing environments for its reliability, efficiency, and scalability. The acquisition and prediction of ticket price data falls within the scope of Big Data processing, yet few research results exist on it at present. It is one focus of this thesis and has considerable market demand and economic value. In addition, the existing Hadoop task assignment strategy does not adapt well to Big Data processing; improving that strategy, and with it the efficiency of job execution, is the other focus of this thesis.

This thesis addresses ticket price prediction in a Hadoop environment and the Hadoop task assignment problem in heterogeneous environments. The specific work is as follows:

1) Ticket data are collected with crawler technology and the price variation is analyzed. The Cluster_Predict_Ticket algorithm, based on the ticket price density image, is proposed to determine whether or not to buy a ticket.
2) To address the inefficiency of processing large amounts of price data, Cluster_Predict_Ticket is migrated to the Hadoop platform using the MapReduce framework; the resulting algorithm is named PCluster_Predict_Ticket. Experimental results show that PCluster_Predict_Ticket is more scalable and efficient without losing predictive accuracy.
3) The HTA (Hadoop Task Assignment) problem in heterogeneous environments is defined and modeled as a minimum cost maximum flow problem. An algorithm named λ-Flow is proposed that divides the task assignment process into several rounds. In each round, λ-Flow dynamically collects the cluster state and the execution results of the previous round and assigns tasks accordingly. Comparison experiments show that λ-Flow performs better than existing algorithms in a dynamically changing cluster and effectively reduces job execution time.

This thesis demonstrates the scalability of PCluster_Predict_Ticket and the effectiveness of λ-Flow, addressing the efficiency of ticket price prediction in a Big Data environment and improving the execution efficiency of Hadoop jobs. The research on MapReduce-based data mining is not limited to ticket price prediction and can be extended to other data mining settings. The work on task assignment also serves as a useful reference for cloud computing scheduling.
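The minimum cost maximum flow modeling mentioned in item 3) can be illustrated with a small sketch. This is not the thesis's λ-Flow algorithm, whose details the abstract does not give; it shows only a single, static round of assignment under assumptions made here for illustration. The names `MinCostMaxFlow`, `assign_tasks`, `cost_matrix`, and `slots` are hypothetical: `cost_matrix[i][j]` stands for an estimated cost of running task i on node j, and `slots[j]` for the number of tasks node j can accept in the round.

```python
from collections import deque


class MinCostMaxFlow:
    """Successive shortest paths on a residual graph (illustrative sketch)."""

    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # per-vertex lists of edge indices
        self.to, self.cap, self.cost = [], [], []

    def add_edge(self, u, v, cap, cost):
        # Forward edge gets an even index; its residual twin is index ^ 1.
        self.graph[u].append(len(self.to))
        self.to.append(v); self.cap.append(cap); self.cost.append(cost)
        self.graph[v].append(len(self.to))
        self.to.append(u); self.cap.append(0); self.cost.append(-cost)
        return len(self.to) - 2  # index of the forward edge

    def solve(self, s, t):
        flow = total = 0
        while True:
            # Queue-based Bellman-Ford: handles negative residual costs.
            dist = [float("inf")] * self.n
            prev_edge = [-1] * self.n
            in_queue = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                in_queue[u] = False
                for e in self.graph[u]:
                    v = self.to[e]
                    if self.cap[e] > 0 and dist[u] + self.cost[e] < dist[v]:
                        dist[v] = dist[u] + self.cost[e]
                        prev_edge[v] = e
                        if not in_queue[v]:
                            in_queue[v] = True
                            queue.append(v)
            if prev_edge[t] == -1:  # no augmenting path left
                return flow, total
            # Find the bottleneck capacity along the path, then augment.
            push, v = float("inf"), t
            while v != s:
                e = prev_edge[v]
                push = min(push, self.cap[e])
                v = self.to[e ^ 1]
            v = t
            while v != s:
                e = prev_edge[v]
                self.cap[e] -= push
                self.cap[e ^ 1] += push
                v = self.to[e ^ 1]
            flow += push
            total += push * dist[t]


def assign_tasks(cost_matrix, slots):
    """Assign each task to one node, respecting per-node slot limits.

    Hypothetical layout: source -> task (cap 1), task -> node (cap 1,
    cost = estimated cost), node -> sink (cap = free slots).
    """
    n_tasks, n_nodes = len(cost_matrix), len(slots)
    source, sink = 0, n_tasks + n_nodes + 1
    net = MinCostMaxFlow(n_tasks + n_nodes + 2)
    task_edges = []
    for i in range(n_tasks):
        net.add_edge(source, 1 + i, 1, 0)
        for j in range(n_nodes):
            e = net.add_edge(1 + i, 1 + n_tasks + j, 1, cost_matrix[i][j])
            task_edges.append((i, j, e))
    for j in range(n_nodes):
        net.add_edge(1 + n_tasks + j, sink, slots[j], 0)
    _, total_cost = net.solve(source, sink)
    assignment = [None] * n_tasks  # None means the task was left unassigned
    for i, j, e in task_edges:
        if net.cap[e] == 0:  # a saturated task->node edge carries the task
            assignment[i] = j
    return assignment, total_cost
```

For example, `assign_tasks([[4, 1], [2, 3], [3, 2]], [2, 1])` places three tasks on two nodes with two and one free slots. A multi-round scheme in the spirit of the abstract would presumably rebuild `cost_matrix` and `slots` from the observed cluster state before each round, but how λ-Flow does this is not stated here.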
Keywords/Search Tags: Data Mining, Price Predicting, Hadoop, Task Assignment, Minimum Cost Maximum Flow