
Application and Implementation of the X10 Language for Massive Data Processing in the Cloud

Posted on: 2013-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: N Y Sun
Full Text: PDF
GTID: 2218330371485198
Subject: Software engineering
Abstract/Summary:
With the rapid expansion of information, massive data processing has become a pressing problem in the Internet era. Traditional languages were not designed for parallel computing, so their computing speed tends to drop as the amount of computation grows. Even on a distributed computing platform, software written in a traditional language remains inefficient because of the language's own limitations. This is undoubtedly the bottleneck of traditional languages. The X10 language, designed by IBM Research, is well suited to processing massive data. If programs written in traditional languages can be converted into X10 with its parallel features and deployed on a cloud computing platform, these bottlenecks in the data mining field can be effectively mitigated. Whether the data processing problem can be solved depends on the design of the parallel computation in X10.

First, this thesis analyzes the characteristics of the X10 language and the differences between traditional languages and X10 in data processing. It then elaborates on the method of writing parallel computations in X10 and analyzes the concrete steps for building MapReduce for X10 in the cloud. After that, it implements the Apriori algorithm, a hot-topic extraction algorithm, and a collaborative filtering algorithm in X10 and runs a performance test on MapReduce in the cloud. The test achieved good efficiency, indicating that parallel algorithm design in X10 can improve performance in massive data processing. Lastly, a micro-blog recommendation system is constructed from the three algorithms above, and it operates efficiently in the simulation test. The test results show that X10 can address the long computing times that traditional algorithms need when dealing with massive data. This thesis provides a feasible solution for parallel computing design in X10 for massive data processing, and the test results demonstrate the effectiveness of this solution.

The innovations of this thesis are as follows:
1. Provide the data set allocation and data structures for the X10 language in the cloud.
2. Provide the programming model for the X10 language on MapReduce in the cloud.
3. Implement the Apriori algorithm, the hot-topic extraction algorithm, and the collaborative filtering algorithm in X10 in the cloud.
4. Construct a simulation environment in which X10 runs stably and efficiently.
5. Show by testing that X10 has certain advantages over traditional languages in massive data processing.

In summary, this thesis implements typical data mining algorithms in the cloud and significantly improves the performance of the algorithms rewritten in X10 in practical applications. The parallel data mining strategy of X10 in the cloud has important theoretical significance and application value. Further study is needed on the selection of dynamic control parameters and on the scalability of the parallel algorithms. It is hoped that the continued improvement of the X10 language will bring better performance in practical applications.
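As a rough illustration of how an Apriori-style support count can be phrased as map and reduce steps, the following minimal sketch is written in Python rather than X10 and is not the thesis's implementation. All function names, the thread pool standing in for cloud workers, and the sample transactions are assumptions made for this example. It shows only the support-counting step that Apriori repeats at each level; candidate pruning is omitted, so every k-item subset is counted.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def map_count(partition, k):
    """Map step: count every k-item subset in one partition of transactions."""
    counts = Counter()
    for transaction in partition:
        # Sort so the same itemset always yields the same tuple key.
        for itemset in combinations(sorted(set(transaction)), k):
            counts[itemset] += 1
    return counts


def reduce_counts(partial_counts):
    """Reduce step: merge per-partition counts into global support counts."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def frequent_itemsets(transactions, k, min_support, n_partitions=2):
    """Return the k-itemsets whose support count is at least min_support."""
    # Hypothetical partitioning: round-robin split across workers.
    partitions = [transactions[i::n_partitions] for i in range(n_partitions)]
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partial = list(pool.map(lambda p: map_count(p, k), partitions))
    total = reduce_counts(partial)
    return {itemset: c for itemset, c in total.items() if c >= min_support}


# Made-up sample data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "diaper", "beer"],
    ["milk", "diaper", "beer"],
    ["bread", "milk", "diaper"],
]
pairs = frequent_itemsets(transactions, k=2, min_support=2)
```

Because each partition is counted independently and the merge is a plain sum, the map step can in principle be spread over as many workers as there are partitions, which is the property a MapReduce deployment relies on.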
Keywords/Search Tags: X10 language, Parallel algorithm, Data mining