
Research On Technology Of Mining User Browsing Paths Based On Hadoop

Posted on: 2016-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: B Yao
Full Text: PDF
GTID: 2308330482469568
Subject: Computer technology
Abstract/Summary:
The rapid development of the Internet has brought an explosion of data, and web servers have accumulated large volumes of logs. How to mine valuable information from these massive web logs has become an active research topic. Effectively mining and analyzing web logs to discover users' preferred browsing paths can both guide optimization of a site's structure and give enterprises a basis for developing better marketing strategies. This dissertation studies techniques for mining user browsing paths based on Hadoop. The work comprises three main parts.

1. A user browsing preference path mining algorithm based on trusted interest (MUPCDI) is proposed and implemented. The algorithm takes into account the degree of interest a user shows in a page while viewing it, together with factors that influence path selection, such as where a page is placed and how it is linked from other pages. The site topology graph is weighted accordingly, and the concept of trusted choice is introduced. Combining trusted choice with page interest yields a measure called the trusted degree of interest, on which MUPCDI is built.

2. A MapReduce version of the trusted-interest algorithm is proposed and implemented. It runs in the distributed environment of the Hadoop platform, analyzes user browsing paths, and mines preferred browsing paths from massive web logs.

3. Comparative experiments on the target data are designed to evaluate the threshold, accuracy, and efficiency of MUPCDI. Further comparative experiments evaluate the efficiency of the MapReduce version on the distributed platform.

The results show that the proposed method, based on the trusted degree of interest, mines user browsing preference paths more accurately and efficiently. For large web log data sets, the MapReduce-based method running in a distributed environment mines preference paths more efficiently than a single-machine environment.
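The general shape of mining browsing paths with a map, shuffle, and reduce pipeline can be illustrated with a small single-process sketch. This is not the MUPCDI algorithm: the abstract does not define the trusted degree of interest, so the sketch substitutes a plain support count (the number of users who traversed a subpath) as the threshold measure. The log layout `(user_id, timestamp, page)`, the sample records, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical log records: (user_id, timestamp_seconds, page).
LOG = [
    ("u1", 10, "/home"), ("u1", 20, "/list"), ("u1", 35, "/item"),
    ("u2", 12, "/home"), ("u2", 30, "/list"), ("u2", 50, "/cart"),
    ("u3", 5, "/home"), ("u3", 9, "/list"), ("u3", 40, "/item"),
]


def map_phase(records):
    """Emit (user_id, (timestamp, page)) pairs, as a mapper would."""
    for user, ts, page in records:
        yield user, (ts, page)


def shuffle(pairs):
    """Group values by key, imitating the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_paths(groups):
    """Order each user's visits by time and return one path per user."""
    return {
        user: [page for _, page in sorted(visits)]
        for user, visits in groups.items()
    }


def count_subpaths(paths, length):
    """Count how many users traversed each contiguous subpath of `length` pages."""
    counts = defaultdict(int)
    for path in paths.values():
        # A set ensures each user contributes at most once per subpath.
        seen = {tuple(path[i:i + length]) for i in range(len(path) - length + 1)}
        for sub in seen:
            counts[sub] += 1
    return counts


def preferred_paths(records, length=2, min_support=2):
    """Return subpaths whose user count meets the support threshold.

    Assumption: plain support stands in for the thesis's interest measure.
    """
    paths = reduce_paths(shuffle(map_phase(records)))
    counts = count_subpaths(paths, length)
    return {sub: n for sub, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    for sub, n in sorted(preferred_paths(LOG).items()):
        print(" -> ".join(sub), n)
```

On a real Hadoop cluster the grouping by user would be performed by the framework between the map and reduce stages, and a second job would typically aggregate subpath counts across reducers; here both stages are collapsed into ordinary function calls so the sketch stays runnable on one machine.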
Keywords/Search Tags: Web log, Preferred browsing path, MapReduce, Hadoop