
The Largest Internet Website User Behavior Analysis Data Platform

Posted on: 2013-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: P Cheng
Full Text: PDF
GTID: 2248330395450458
Subject: Software engineering
Abstract/Summary:
With the development of computer and network technology, Internet applications have become commonplace. Medium and large Internet websites hold a huge amount of data about their users. Analyzing and mining this data effectively to extract user behavior information with real commercial value is of great practical significance, because it allows a website to provide better service to its users. This thesis analyzes in depth the characteristics of a data platform that handles massive user behavior data for medium and large websites, and designs a data analysis and information mining system for that data. It also analyzes and designs the data platform architecture and the data mining model.

Website operation requires data support. To address this, the thesis builds a real-time data transmission platform and a data analysis platform that meet the requirements of the website's different real-time applications. To handle the request load of real-time stream data, the real-time data acquisition system uses nginx combined with resin to receive the data. The system then sends the data to message middleware, which decouples the real-time data acquisition system from the data analysis system. The thesis adopts the high-performance mina framework for data interaction between systems, imports the analysis results into a MongoDB cluster, and serves these results to the front-end business query functions.

Because the product requirements do not demand strong real-time performance for in-depth analysis of user behavior logs, the thesis builds a non-real-time data analysis system that performs the computation offline. This platform needs to balance processing efficiency against storage efficiency for large data volumes. The thesis adopts middleware technology in the system architecture design to improve its soundness and flexibility, which keeps system operation efficient. In addition, the thesis discusses and designs a middleware cache management strategy and the corresponding cache invalidation management techniques.
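
The decoupling described in the abstract, where the acquisition system only hands data to message middleware and the analysis system consumes it independently, can be illustrated with a minimal sketch. This is not the thesis's implementation: an in-process `queue.Queue` stands in for the real message middleware, and the names `acquire`, `analyze`, `run_pipeline`, and the event fields `user` and `page` are all hypothetical.

```python
import queue
import threading
from collections import Counter

STOP = object()  # sentinel that tells the consumer to shut down


def acquire(events, channel):
    """Acquisition side: publish raw events to the channel and nothing more."""
    for event in events:
        channel.put(event)
    channel.put(STOP)


def analyze(channel, page_views):
    """Analysis side: consume events and count views per page."""
    while True:
        event = channel.get()
        if event is STOP:
            break
        page_views[event["page"]] += 1


def run_pipeline(events):
    """Wire producer and consumer together through the queue only."""
    channel = queue.Queue()  # assumed stand-in for the message middleware
    page_views = Counter()
    consumer = threading.Thread(target=analyze, args=(channel, page_views))
    consumer.start()
    acquire(events, channel)
    consumer.join()
    return page_views


sample = [
    {"user": "u1", "page": "/home"},
    {"user": "u2", "page": "/home"},
    {"user": "u1", "page": "/cart"},
]
result = run_pipeline(sample)
```

Neither function calls the other; they share only the channel, so either side could be replaced or scaled without changing the other, which is the property the abstract attributes to the middleware.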
Keywords/Search Tags: user behavior analysis, real-time data mining, middleware technology