The Research And Application Of Collaborative Filtering Algorithm With Big Data Visualization Model

Posted on: 2016-07-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Guan
Full Text: PDF
GTID: 2428330473465638
Subject: Computer technology
Abstract/Summary:
With data being generated rapidly by the Internet, bioinformatics, sensors, and other sources, big data and distributed processing have become popular topics. At the same time, big data analysis, particularly mining valuable information from big data, and data visualization with good tools are research trends in the big data and visualization field. However, products that combine a big data processing framework with visualization tools are rare. In the information age, data of all kinds are like natural resources, and the demand for such a product is becoming more pressing. This thesis designs an application architecture for visualization and big data processing. It consists of the following parts: research and development of a big data visualization algorithm model, a production system, and the processing of actual application data.

1) The research and development of the big data visualization algorithm model mainly concerns convenience and the design of parallel algorithms. Data analysts and data scientists can use this model to design their own parallel algorithms, analyze big data, and visualize the data. In this thesis, we propose a big data visualization algorithm analysis platform model. The model addresses the limitation that R has traditionally been usable only for analysis of sampled data; with the model, the full data set can be analyzed with R. The model makes full use of the advantages of R and of big data processing with the Hadoop framework. To verify the model, this thesis implements a parallel collaborative filtering algorithm based on DAG. The experimental results show that the model has good extensibility and operability, and that the extensibility of the algorithm itself is also moderately improved. The system is described in detail in Chapter 3.

2) The big data production system mainly integrates the algorithms, which data analysts then use to analyze the real-time recommendation system. This application mainly uses big data processing and storage architectures, such as the distributed file system HDFS and computation models such as MapReduce and Spark. Within the big data framework, the practical application system can run multiple strategies and supports multiple algorithms, so that multiple big data production applications can be tested. The practical application system integrates many big data algorithms, such as Mahout's clustering, classification, and recommendation algorithms, and includes general-purpose algorithm libraries for MapReduce, Spark, Streaming, Machine Learning, and Graph Processing. Data analysts can call these algorithms directly; for the detailed design, see Chapter 4.

Using this integrated architecture for visualization and big data processing, data scientists can work directly with real data to study the design of all kinds of algorithms, mine the long tail, and, based on the valuable information obtained, speed up real-time decision-making for industry, creating greater value for society. In this thesis, we design and implement an improved parallel collaborative filtering algorithm with the big data visualization model we designed. We then compare the experimental results against a variety of algorithms. The experimental results show that the big data visualization model has good extensibility and data processing ability. On this basis, we design and implement an integrated application of several kinds of recommendation algorithms.
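
The abstract does not spell out the collaborative filtering algorithm itself. As a rough illustration only, the following is a minimal single-machine sketch of generic item-based collaborative filtering, written in a map-then-reduce style. It is not the thesis's DAG-based parallel algorithm and uses neither R nor Hadoop; the toy ratings and the function names `group_by_user`, `item_similarities`, and `predict` are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy data: (user, item, rating) triples, not from the thesis.
RATINGS = [
    ("u1", "A", 5.0), ("u1", "B", 3.0),
    ("u2", "A", 4.0), ("u2", "B", 2.0), ("u2", "C", 5.0),
    ("u3", "B", 4.0), ("u3", "C", 4.0),
]

def group_by_user(ratings):
    """Map/shuffle step: collect each user's {item: rating} dictionary."""
    by_user = defaultdict(dict)
    for user, item, rating in ratings:
        by_user[user][item] = rating
    return by_user

def item_similarities(by_user):
    """Reduce step: cosine similarity between items, built from co-ratings."""
    dot = defaultdict(float)   # sum of rating products per item pair
    norm = defaultdict(float)  # sum of squared ratings per item
    for items in by_user.values():
        for item, r in items.items():
            norm[item] += r * r
        # Each user contributes one partial product per pair of rated items.
        for (i, ri), (j, rj) in combinations(sorted(items.items()), 2):
            dot[(i, j)] += ri * rj
    sims = defaultdict(dict)
    for (i, j), d in dot.items():
        s = d / (sqrt(norm[i]) * sqrt(norm[j]))
        sims[i][j] = s
        sims[j][i] = s
    return sims

def predict(user_items, sims, target):
    """Similarity-weighted average of the user's ratings on related items."""
    num = den = 0.0
    for item, r in user_items.items():
        s = sims.get(target, {}).get(item, 0.0)
        num += s * r
        den += abs(s)
    return num / den if den else None  # None when no related item was rated

if __name__ == "__main__":
    by_user = group_by_user(RATINGS)
    sims = item_similarities(by_user)
    print(predict(by_user["u1"], sims, "C"))
```

Because each user's contribution to the pairwise sums is independent of every other user's, the accumulation in `item_similarities` is the kind of step that could in principle be distributed across workers in a MapReduce or Spark job, which is presumably the motivation for a parallel formulation.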
Keywords/Search Tags:Big data, R, MapReduce, Spark, visualization, Algorithms library