Design And Implementation Of Web Log Analysis System Based On MongoDB

Posted on: 2015-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Sun
Full Text: PDF
GTID: 2268330428469057
Subject: Computer technology
Abstract/Summary:
As Internet enterprises grow in scale, the volume of web log information grows with them. To provide better service, it is necessary to understand the characteristics and needs of user access and to analyze and study user behavior; web log analysis arose to meet this need. It combines traditional data mining with web logs: it extracts useful information from large volumes of web log data and counts and analyzes user behavior and page views in order to infer users' access patterns. It is useful in many settings, such as network security, website construction, and market analysis for e-commerce, and it is a new research direction in data mining.

NoSQL is the general name for non-relational databases, a data storage technology that emerged to meet the needs of rapidly growing Internet applications. Because such databases are easy to scale, offer high read and write performance even on large amounts of data, and have flexible data models, they have developed well in certain application scenarios. MongoDB is a representative NoSQL database. Its document-oriented data model allows it to split data automatically and store it on different machines. This automatic sharding mechanism provides distributed scaling: the collections and documents in a database can be stored across many database nodes. MongoDB is widely used because of its good horizontal scalability. It is well suited to storing low-value, large-sized files, and it offers a data management technology that supports the high concurrency and massive data processing required as the Internet moves toward cloud computing. These characteristics make it well suited to web log analysis.

This thesis studies the design of an efficient web log analysis program based on the distributed database MongoDB. Web log analysis gathers and stores the log information generated when users access web pages, and then transforms, cleans, and mines it. The thesis compares MongoDB with traditional relational databases and analyzes its advantages and application scenarios. Its denormalized design uses nested documents to avoid joins, which makes querying and storing large data sets efficient. Web logs are stored in MongoDB and analyzed directly with its built-in MapReduce programming model, and the results are saved as files for business staff to use. The aim is to discover the access rules and patterns hidden in the log data through effective mining of web log data, and to provide useful information for optimizing website structure and business models.
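
The MapReduce-style analysis described above can be illustrated with a minimal sketch in plain Python. This is not the thesis's implementation: the record layout, the field names (`ip`, `request`, `status`), the sample data, and the helper names are all assumptions made for illustration, and MongoDB's own MapReduce takes JavaScript functions rather than Python ones. The sketch only shows the idea of counting page views per URL over nested log documents.

```python
from collections import defaultdict

# Hypothetical log records, shaped as nested documents so that request
# details live inside each record and no join is needed to read them.
SAMPLE_LOGS = [
    {"ip": "10.0.0.1", "request": {"method": "GET", "url": "/index.html"}, "status": 200},
    {"ip": "10.0.0.2", "request": {"method": "GET", "url": "/index.html"}, "status": 200},
    {"ip": "10.0.0.1", "request": {"method": "GET", "url": "/about.html"}, "status": 200},
    {"ip": "10.0.0.3", "request": {"method": "GET", "url": "/missing"}, "status": 404},
]


def map_page_view(doc):
    """Map step: emit (url, 1) for every successful request."""
    if doc["status"] == 200:
        yield doc["request"]["url"], 1


def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one URL."""
    return sum(values)


def map_reduce(docs, mapper, reducer):
    """Group the mapper's output by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


# Page views per URL; failed requests are excluded by the map step.
page_views = map_reduce(SAMPLE_LOGS, map_page_view, reduce_counts)
```

In this sketch the map step filters and emits key-value pairs and the reduce step aggregates them per key, which mirrors the division of work in a MapReduce job; writing `page_views` to a file would correspond to saving the analysis results for later use.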
Keywords/Search Tags: log analysis, MongoDB, MapReduce