
Research And Improvement Of Session Identification Methods In Web Logs

Posted on: 2017-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yuan
Full Text: PDF
GTID: 2358330485962759
Subject: Computer application technology
Abstract/Summary:
With the continuous development of the Internet, more and more organizations, enterprises, and institutions communicate or conduct transactions with users over the network. To retain existing users and attract potential customers, they need to improve the user experience and make their websites more practical and more attractive. Achieving this requires knowing what users are interested in, optimizing the site structure according to the characteristics of user access, and developing personalized services. Analyzing users' access records, that is, Web logs, can reveal latent user access patterns, which can then be used to improve the site structure, develop personalized services, and enhance the user experience. Web log mining is an important subfield of Web data mining that extracts potentially useful knowledge or patterns from Web log data. Session identification is an important step in Web log mining. This thesis reviews a variety of session identification methods and proposes an optimized one: a session identification method that uses a dynamic threshold based on page interest.
In this method, the average dwell time of a page is combined with the page interest to generate a dynamic, page-interest-based threshold, which is then used to identify sessions. The main work is as follows:
1) The thesis systematically introduces the concepts and classification of data mining and Web mining, then describes in detail the concepts, techniques, and procedures of Web log mining, focusing on data preprocessing in Web log mining.
2) To generate a personalized dynamic threshold for session identification, the thesis defines page interest, namely the user's degree of interest in a page, and quantifies it from the nature of the page and the user's browsing speed relative to the page.
3) To address the problems of current session identification methods, the thesis proposes a dynamic-threshold session identification method. It combines the average dwell time of a page with the page interest to produce a page-interest-based dynamic threshold, which makes up for the inability of traditional time-threshold session identification to adjust the threshold for different users and different websites.
Experimental results show that, compared with current session identification that relies on a single fixed threshold, the proposed method makes better use of user and page characteristics and is more reasonable and effective.
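The contrast between a fixed timeout and a page-dependent threshold can be illustrated with a minimal Python sketch. The abstract does not give the thesis's formulas, so everything specific here is an assumption made for illustration: the rule `scale * avg_dwell * (1 + interest)`, the `scale`, `cap`, and `default` parameters, the function names, and the representation of interest as a per-page score in [0, 1] supplied by the caller.

```python
from collections import defaultdict
from statistics import mean


def split_sessions(hits, threshold_for):
    """Split one user's hits into sessions.

    hits: list of (timestamp_seconds, page), sorted by time.
    threshold_for: function mapping the previous page to the largest
    gap (in seconds) that still counts as the same session.
    """
    sessions = []
    for ts, page in hits:
        if sessions:
            prev_ts, prev_page = sessions[-1][-1]
            if ts - prev_ts <= threshold_for(prev_page):
                sessions[-1].append((ts, page))
                continue
        sessions.append([(ts, page)])
    return sessions


def fixed_threshold_sessions(hits, timeout=1800):
    """Baseline: the same timeout for every page and every user."""
    return split_sessions(hits, lambda page: timeout)


def average_dwell(all_hits, cap=600):
    """Average observed dwell time per page across all users.

    all_hits: dict mapping user -> sorted list of (timestamp, page).
    Gaps above `cap` are ignored (assumption: they are not real reading time).
    """
    dwell = defaultdict(list)
    for hits in all_hits.values():
        for (ts, page), (next_ts, _) in zip(hits, hits[1:]):
            gap = next_ts - ts
            if gap <= cap:
                dwell[page].append(gap)
    return {page: mean(gaps) for page, gaps in dwell.items()}


def dynamic_threshold_sessions(hits, avg_dwell, interest, scale=3.0, default=1800):
    """Page-dependent threshold (hypothetical formula, not the thesis's own).

    interest: dict mapping page -> this user's interest score in [0, 1].
    Pages without dwell statistics fall back to `default`.
    """
    def threshold_for(page):
        if page not in avg_dwell:
            return default
        return scale * avg_dwell[page] * (1.0 + interest.get(page, 0.0))

    return split_sessions(hits, threshold_for)
```

Under these assumptions, a long pause on a page that is normally read quickly starts a new session, while the same pause on a page with a long average dwell time or a high interest score does not. A single fixed timeout cannot make that distinction.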
Keywords/Search Tags: Web log mining, Session identification, Page interest, Threshold