
Research And Design Of User Network Behavior Analysis And Mining System Based On Cloud Computing

Posted on: 2016-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: J F Pi
Full Text: PDF
GTID: 2428330482481303
Subject: Circuits and Systems
Abstract/Summary:
In today's information society, more and more people obtain information and communicate through the Internet. In recent years, page views on many portals and social networking sites have grown exponentially, as Internet users browse news pages and express opinions according to their needs. For large websites such as Sina and Sohu, the volume of visits in a single day almost reaches the TB level. Meanwhile, users are overwhelmed by the mass of information, and site operators find it difficult to provide personalized services tailored to users. A system capable of analyzing users' network behavior is therefore very valuable. However, extracting the characteristics of user network behavior from TB-scale access data requires efficient and reliable technology as support. With the continuous development of cloud computing and software frameworks, a platform with these capabilities has become possible. As a commercial computing model, cloud computing distributes tasks across a computer cluster for parallel processing and can dynamically allocate cluster resources according to the amount of data in each task. Among the many cloud computing platforms, Hadoop is the most widely used, with HDFS as its distributed file system and Map/Reduce as its computing framework. Its greatest advantages are that it is open source, scalable, and reliable. A job is broken into many small tasks that are processed in parallel, and their results are aggregated to produce the final result. Hive, Hadoop's data warehouse, provides a way of operating on data that is similar to working with a traditional database. Sqoop provides a quick way to transfer data from the Hadoop cluster to a traditional database.

Using these cloud computing technologies, this thesis proposes a system for mining users' network behavior. A Hadoop cluster is set up on Linux and user access logs are uploaded to it. The Map/Reduce computing framework and Hive are then used for multiple rounds of mining to analyze and count the key indicators of websites and the characteristics of user network behavior. Sqoop exports the results to a traditional database, and a web server based on the B/S (browser/server) architecture accesses that database and presents the results to website decision makers. The main research contents are as follows: analysis and design of the core modules, analysis and processing of the key indicators in the logs, and design and implementation of multiple mining with Map/Reduce and Hive.
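The Map/Reduce pattern the abstract relies on, splitting log analysis into small independent tasks and aggregating their results, can be illustrated with a minimal single-process sketch in Python. This is not the thesis's implementation: the log format (`<ip> <timestamp> <url>`), the sample records, the function names, and the choice of page views (PV) and unique visitors (UV) per URL as the "key indicators" are all assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical log format (an assumption): "<ip> <timestamp> <url>"
SAMPLE_LOG = [
    "10.0.0.1 2016-04-01T10:00:00 /news/1",
    "10.0.0.2 2016-04-01T10:00:05 /news/1",
    "10.0.0.1 2016-04-01T10:01:00 /sports/7",
    "10.0.0.1 2016-04-01T10:02:00 /news/1",
]


def map_record(line):
    """Map step: emit one (url, ip) pair per log line; skip malformed lines."""
    parts = line.split()
    if len(parts) != 3:
        return []
    ip, _timestamp, url = parts
    return [(url, ip)]


def reduce_url(url, ips):
    """Reduce step: PV is the number of hits, UV the number of distinct IPs."""
    ips = list(ips)
    return url, {"pv": len(ips), "uv": len(set(ips))}


def run_job(lines):
    """Run map, shuffle (sort and group by key), and reduce in one process."""
    pairs = [pair for line in lines for pair in map_record(line)]
    pairs.sort(key=itemgetter(0))  # the shuffle phase groups records by URL
    results = {}
    for url, group in groupby(pairs, key=itemgetter(0)):
        key, stats = reduce_url(url, (ip for _, ip in group))
        results[key] = stats
    return results


if __name__ == "__main__":
    for url, stats in sorted(run_job(SAMPLE_LOG).items()):
        print(url, stats)
```

On a real Hadoop cluster the map and reduce steps would run as separate tasks on different nodes and the framework would perform the shuffle. The sketch only shows how per-record mapping and per-key aggregation fit together.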
Keywords/Search Tags: Cloud Computing, Hadoop, Hive, Sqoop