User Network Behavior Analysis in Specific Industries Based on Web Data

Posted on: 2018-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Peng
Full Text: PDF
GTID: 2348330518996363
Subject: Information and Communication Engineering
Abstract/Summary:
Web data contains rich information about users' network access patterns and is therefore of great value for analyzing user network behavior. The web data used in this thesis consists of two parts: user web log data provided by a domestic operator and collected with deep packet inspection (DPI), and web page data captured by a web crawler. Based on these two kinds of data, the thesis analyzes user network behavior on e-commerce and automobile websites.

(1) Analysis of user network behavior in the e-commerce industry. Users of the JD, Tmall, Gome, and Suning websites are taken as the research objects. Basic statistical analysis implemented with MapReduce is applied to their access behavior on selected commodities, including browsing, searching, and adding items to the shopping cart. The BulkLoad tool is used to batch-import the user behavior data into HBase tables. This avoids the slow system response caused by frequent I/O and garbage collection (GC) operations, as well as nodes exiting on timeouts, and so improves both the stability of the cluster and the efficiency of data storage. Finally, the HBase data query interface is used to build customized queries over the user behavior data from e-commerce websites, so that the data can be queried centrally according to specified conditions.

(2) Analysis of user network behavior in the automobile industry. The Ford Edge is taken as the research sample. The AprioriAll sequential pattern mining algorithm is used to obtain the frequent sequence sets of user visits to the top 15 automobile websites. These show which automobile websites users interested in this car tend to consult for related information, and in what order they visit them. MapReduce-based statistics are then combined with a RESTful API to visualize the analysis of user network access and interest tags.
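The frequent-sequence step in (2) can be illustrated with a small sketch. This is not the thesis's implementation and not the full AprioriAll algorithm, which also has litemset, transformation, and maximal-sequence phases. It is a simplified level-wise search in the same Apriori spirit, assuming each visit is a single website and that support is counted per user session. The function names, the site names, and the `min_support` threshold are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Sequence[str], visits: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `visits` in order (gaps allowed)."""
    remaining = iter(visits)
    # `site in remaining` consumes the iterator, so order is enforced.
    return all(site in remaining for site in pattern)


def frequent_sequences(sessions: List[List[str]], min_support: int) -> Dict[Pattern, int]:
    """Level-wise search: length-k candidates come only from frequent length-(k-1) patterns."""

    def support(pattern: Pattern) -> int:
        # Number of sessions that contain the pattern as a subsequence.
        return sum(is_subsequence(pattern, s) for s in sessions)

    # Level 1: single websites that are frequent on their own.
    sites = sorted({site for s in sessions for site in s})
    level = {(site,): support((site,)) for site in sites}
    level = {p: c for p, c in level.items() if c >= min_support}

    frequent: Dict[Pattern, int] = {}
    while level:
        frequent.update(level)
        # Join step: p and q overlap on all but p's first and q's last item.
        candidates = {p + (q[-1],) for p in level for q in level if p[1:] == q[:-1]}
        counted = {c: support(c) for c in candidates}
        level = {p: c for p, c in counted.items() if c >= min_support}
    return frequent


if __name__ == "__main__":
    # Hypothetical sessions: ordered lists of automobile sites visited by one user.
    demo = [
        ["site_a", "site_b", "site_c"],
        ["site_a", "site_c"],
        ["site_b", "site_a", "site_c"],
        ["site_a", "site_b"],
    ]
    for pattern, count in sorted(frequent_sequences(demo, min_support=3).items()):
        print(" -> ".join(pattern), count)
```

With the demo sessions above, the only frequent pattern longer than one site is `site_a -> site_c`. In a study like this one, that would be read as users tending to visit the first site before the second.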
In addition, regular expressions are used to extract the automobile user data, and the RegexBuddy tool is used to debug and optimize them. Combined with hash-based data storage, this reduces the time complexity of the user data extraction program from O(N) to O(1) and improves its running efficiency. The results of this research may give dealers and advertisers of specific products and cars some recommendations on user targeting, precision advertising delivery, and cross-promotion of advertisements.
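The abstract does not say what is hashed or what log format is parsed, so the following is only a sketch under stated assumptions. It assumes a hypothetical tab-separated log line of the form `<user_id>\t<url>`, extracts the host with a precompiled regular expression, and stores the results in a dictionary keyed by user ID. Looking up one user's records is then an average O(1) hash lookup instead of an O(N) scan over all extracted records, which is one plausible reading of the O(N) to O(1) claim.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Hypothetical line layout "<user_id>\t<url>"; the real DPI log format is not given.
LINE_RE = re.compile(r"^(?P<user>\S+)\t(?:https?://)?(?P<host>[^/\s]+)")


def build_index(lines: List[str]) -> Dict[str, List[str]]:
    """Parse each line once and group the visited hosts by user ID."""
    index: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # lines that do not fit the assumed layout are skipped
            index[match.group("user")].append(match.group("host"))
    return dict(index)


def hosts_for_user(index: Dict[str, List[str]], user_id: str) -> List[str]:
    """Average O(1) lookup by hashing the user ID; no scan over all records."""
    return index.get(user_id, [])


if __name__ == "__main__":
    sample = [
        "u1\thttp://auto.example.com/edge",
        "u2\thttps://cars.example.org/review?id=1",
        "u1\tcars.example.org/price",
        "malformed line",
    ]
    idx = build_index(sample)
    print(hosts_for_user(idx, "u1"))
```

Building the index still reads every line once. The saving is in the repeated per-user lookups that follow.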
Keywords/Search Tags: Web dataset, MapReduce, sequential pattern mining algorithm AprioriAll, automobile industry, e-commerce industry