Research And Implementation Based On NoSQL Massive Data Analysis Engine

Posted on: 2017-01-22 | Degree: Master | Type: Thesis
Country: China | Candidate: K Ren | Full Text: PDF
GTID: 2358330482499343 | Subject: Computing technology
Abstract/Summary:
With the growth of the Internet and the mobile Internet, data volumes are increasing exponentially, and mining users' behavior and relationships quickly, efficiently, and in depth has become an important topic. Effective data mining not only evaluates existing systems and captures real user feedback, but also reveals users' attributes and relationships, which in turn helps improve product quality. It is therefore worthwhile to design a massive data statistics engine that offers real-time processing, fault tolerance, concurrency, data compression, and multi-dimensional analysis. Through massive data analysis we can accurately obtain user behavior data and build data mining models, which greatly improves the efficiency of statistics and data mining.
In this thesis, we use MongoDB, Ruby, Docker, and other big data technologies to develop a massive data analysis engine that compares favorably with most data analysis engines on the market. Its features include customizable dimensions, data processing without delay, concurrency, dynamic scaling, high fault tolerance, and high-performance storage. Dimensions and strategies can be customized quickly with a Ruby DSL, and the Mongoid object-document mapper (ODM) greatly improves the efficiency of data mining. MongoDB's auto-sharding and replication provide horizontal scaling and fault tolerance for the system, and MongoDB MapReduce improves the efficiency of massive data processing. Docker makes it easy to build test and production environments, so we build the basic software cloud services and the MongoDB clusters on Docker's LXC virtualization technology. Docker also helps solve traditional build problems such as multi-version, multi-system, and multi-configuration environments, high test error rates, and difficult product releases, so development, test, and production environments can all be built rapidly.
Finally, we set up an experimental environment to complete the Docker virtualization platform, run MongoDB cluster experiments in the Docker environment, and complete the massive data analysis system with Ruby and MongoDB.
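The abstract does not show what the Ruby DSL for customizing dimensions looks like. As a rough illustration only, the sketch below shows one way such a DSL could be shaped in plain Ruby. Every name in it (`StatDefinition`, `define_stat`, `dimension`, `aggregate`) is hypothetical and not taken from the thesis, and an in-memory array stands in for a MongoDB collection, so this is a sketch of the idea under those assumptions, not the engine's actual interface.

```ruby
# Hypothetical sketch: none of these class or method names come from the thesis.
class StatDefinition
  attr_reader :name, :dimensions

  def initialize(name)
    @name = name
    @dimensions = {}
  end

  # Declare a dimension. An optional block derives the value from an event;
  # without a block, the event field with the same key is used.
  def dimension(key, &extractor)
    @dimensions[key] = extractor || ->(event) { event[key] }
  end

  # Group events by their dimension values and count each group.
  # Assumption: events are plain hashes held in memory, standing in for
  # documents that a real engine would read from MongoDB.
  def aggregate(events)
    events.each_with_object(Hash.new(0)) do |event, totals|
      group = @dimensions.transform_values { |extract| extract.call(event) }
      totals[group] += 1
    end
  end
end

# Entry point of the DSL: evaluates the block in the context of a new definition.
def define_stat(name, &block)
  definition = StatDefinition.new(name)
  definition.instance_eval(&block)
  definition
end

# Usage: count page views per platform and per day.
page_views = define_stat(:page_views) do
  dimension :platform
  dimension(:day) { |event| event[:time][0, 10] }
end

events = [
  { platform: "ios", time: "2017-01-22T10:00:00" },
  { platform: "ios", time: "2017-01-22T11:30:00" },
  { platform: "web", time: "2017-01-23T09:15:00" }
]

page_views.aggregate(events).each do |group, count|
  puts "#{group} => #{count}"
end
```

Declaring dimensions as data rather than hard-coding them lets new statistics be added without touching the aggregation logic. In an engine like the one described, the grouping step would presumably be delegated to MongoDB (for example through MapReduce, as the abstract mentions) rather than done in Ruby.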
Keywords/Search Tags: NoSQL, MongoDB Cluster, Docker Virtualization Technology, Cloud Computing, Massive Data Analysis