Large Data Analysis Of User Behavior Based On Web Log

Posted on: 2019-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Song
GTID: 2428330545479852
Subject: Computer application technology
Abstract/Summary:
Big data has recently become a key topic. The global management consulting firm McKinsey has said: "Data has already been widely applied in every profession today and has become one of the most critical factors of production. Digging up the hidden content of big data and using big data effectively can boost productivity." With the arrival of the "big data era" and the widespread use of big data, the volume of data has multiplied. Such data was once regarded as useless junk on the network, but as the technology for collecting and analyzing data has developed, its important role in daily life has become better understood, and it can guide the development direction of some enterprises. How to use this data and find the hidden patterns in it is a hot topic in current research.

Network operators are the entities that run networks and supply services to users. While serving customers, they can also save the data on every web page that users view. From this data, operators can learn how users behave, which helps them market to each user the products that user is more likely to buy, so that the site's marketing becomes more precise and targeted. This thesis therefore takes the Web log records of e-commerce sites as an example, extracts the user attributes contained in the data, and uses a feature-attribute-weighted Naive Bayes classifier to classify the consumption propensity of different users. The work consists of the following points:

(1) Web log mining of big data records of user behavior is described in detail from three aspects: Web log preprocessing, user attribute extraction, and user behavior analysis, with emphasis on the first two.
(2) Data preprocessing operations are analyzed in detail, including cleaning redundant and other unneeded data, identifying whether a record belongs to an independent user, and identifying whether a record starts a new operation. A Spark-based feature attribute extraction method is proposed that extracts user attributes from the Web logs of user visits, such as the type of goods, the user's location, and access and waiting time.
(3) Based on the conditional independence of attributes that Bayesian classification requires, a Naive Bayes classifier based on feature attributes is designed. Using the attribute values obtained from user behavior, the classifier is computed with feature attribute weighting to classify users' behavior as high or low consumption.
(4) Based on the Spark framework, the behavior data of network users in the Web logs of different e-commerce sites is analyzed. From the users' browsing and purchase records obtained by a crawler, a user's purchase intention can be forecast.
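The attribute-weighted classification step can be illustrated with a small sketch. The abstract does not give the thesis's weighting formula or how its weights are derived, so the code below assumes one common form of attribute weighting, in which each conditional probability is raised to a per-attribute weight: score(c) = log P(c) + Σ w_i · log P(a_i | c). The class name `WeightedNaiveBayes`, the toy records, the weight values, and the Laplace smoothing constant are all hypothetical, and the sketch runs in plain Python rather than on Spark.

```python
import math
from collections import Counter, defaultdict


class WeightedNaiveBayes:
    """Attribute-weighted Naive Bayes for categorical attributes.

    Illustrative only: the weighting scheme is an assumption, not the
    thesis's actual method.
    """

    def __init__(self, weights, alpha=1.0):
        self.weights = weights  # one weight per attribute (hypothetical values)
        self.alpha = alpha      # Laplace smoothing constant (assumed)

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # counts[(attribute index, class)][value] = number of occurrences
        self.counts = defaultdict(Counter)
        # distinct values seen per attribute, needed for smoothing
        self.values = defaultdict(set)
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.counts[(i, label)][value] += 1
                self.values[i].add(value)
        return self

    def log_scores(self, row):
        """Return log P(c) + sum_i w_i * log P(a_i | c) for every class c."""
        scores = {}
        for label, n_c in self.class_counts.items():
            score = math.log(n_c / self.total)
            for i, value in enumerate(row):
                num = self.counts[(i, label)][value] + self.alpha
                den = n_c + self.alpha * len(self.values[i])
                score += self.weights[i] * math.log(num / den)
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


# Toy records: (goods category, user location, dwell-time bucket).
rows = [
    ("electronics", "beijing", "long"),
    ("electronics", "shanghai", "long"),
    ("electronics", "beijing", "short"),
    ("groceries", "beijing", "short"),
    ("groceries", "shanghai", "short"),
    ("groceries", "shanghai", "long"),
]
labels = ["high", "high", "high", "low", "low", "low"]

# Hypothetical weights: goods category counts most, location least.
model = WeightedNaiveBayes(weights=[2.0, 0.5, 1.0]).fit(rows, labels)

if __name__ == "__main__":
    print(model.predict(("electronics", "shanghai", "long")))
    print(model.predict(("groceries", "beijing", "short")))
```

Setting every weight to 1.0 reduces this to ordinary Naive Bayes, which is a convenient baseline when checking whether a given weighting actually changes the predicted consumption class.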
Keywords/Search Tags: Web log mining, log preprocessing, Naïve Bayes classification, The Spark framework, User behavior analysis