Research And Application Of Hadoop In Business Intelligence

Posted on: 2017-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: G Qiu
Full Text: PDF
GTID: 2348330503468016
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the rapid development of Internet information technologies, we have entered the age of big data. As data volumes expand, the primary problem facing decision makers and customers is how to extract valid information from these huge amounts of data. Business Intelligence (BI) makes it possible to turn data into knowledge, and the recommendation algorithms used in BI build an effective connection among products, information, and customers. In addition, research on the Hadoop platform has made distributed analysis of big data more efficient and convenient. Building on current research into personalized recommendation algorithms, this thesis proposes a distributed collaborative filtering recommendation algorithm based on grey relational analysis, drawing on grey system theory and distributed processing methods for big data.

First, the thesis discusses the collaborative filtering recommendation (CFR) algorithm and the implementations of user-based, item-based, and model-based CFR algorithms. Second, it introduces grey system theory and grey relational analysis, followed by the characteristics and calculation of different Grey Relational Grade (GRG) models. Third, based on the Hadoop ecosystem, it discusses the storage and read-write principles of distributed systems and the design principles of distributed databases, and introduces parallel computation and scheduling through a study of the MapReduce framework.

Current CFR algorithms suffer from incomplete and uncertain information in the rating matrix, data sparsity, and computing and scalability bottlenecks on big data. To address these problems, the thesis combines grey relational analysis with distributed computing theory and proposes a distributed collaborative filtering recommendation algorithm based on grey relational analysis.
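The abstract does not state which GRG model the thesis adopts or how it handles missing ratings. As a minimal sketch only, the following assumes Deng's classical grey relational grade with a distinguishing coefficient of 0.5 and rating vectors that are fully co-rated; the function name `grey_relational_grades` and the sample ratings are hypothetical and serve only to show how a grade between a target user and other users could be computed as a similarity measure.

```python
def grey_relational_grades(reference, candidates, rho=0.5):
    """Deng's grey relational grade of each candidate sequence against
    the reference sequence (assumed model, not taken from the thesis)."""
    if not candidates:
        return []
    # Absolute differences between the reference and each candidate, per item.
    deltas = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every candidate is identical to the reference.
        return [1.0 for _ in candidates]
    grades = []
    for row in deltas:
        # Grey relational coefficient for each item, then its mean (the grade).
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades


# Hypothetical ratings: one target user compared with two other users.
target = [5, 3, 4]
others = [[5, 3, 4], [1, 5, 2]]
grades = grey_relational_grades(target, others)
```

A grade closer to 1 indicates a rating pattern closer to the target user's. In a user-based CFR setting such grades could take the place of cosine or Pearson similarity when selecting neighbours, though whether the thesis uses them this way is not stated in the abstract.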
Finally, the thesis builds a Hadoop distributed system and distributed database environment on a server cluster and implements the algorithm on it. The experimental results show that the proposed CFR algorithm can effectively perform large-scale data recommendation and achieves a lower mean absolute error than the classical CFR algorithm. It also addresses data scalability, since data nodes can be added to the Hadoop cluster. In addition, the effective application of Hadoop to Business Intelligence demonstrates the feasibility of the proposed recommendation algorithm.
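The evaluation metric named above, mean absolute error (MAE), is the average absolute difference between predicted and actual ratings. The sketch below illustrates the standard definition; the function name and the sample values are hypothetical and are not results from the thesis.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over all rated items."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical predictions and held-out ratings.
mae = mean_absolute_error([4, 3, 5], [5, 3, 3])
```

A lower MAE means the predicted ratings lie closer to the ratings users actually gave.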
Keywords/Search Tags:business intelligence, grey relational analysis, collaborative filtering recommendation algorithm, distributed system, Hadoop