Research On Website Optimization Strategy Based On Frequent Pattern Mining

Posted on: 2017-02-04  Degree: Master  Type: Thesis
Country: China  Candidate: H R Zhang  Full Text: PDF
GTID: 2348330503983626  Subject: Computer software and theory
Abstract/Summary:
With the development of information technology, the Internet has become widely used, and a large number of portals have been built. They enrich the ways people obtain information, but they also bring problems: it is difficult for a user to find the knowledge they need across such a wide variety of portals. Website builders therefore face the question of how to optimize the topological structure and page content of a site so that users can easily obtain the information they are interested in from vast amounts of data. Web log mining is a comprehensive technology for extracting information that is valuable to users from web log data. Frequent pattern mining is a commonly used method of Web log mining, and it allows us to obtain users' browsing paths.

This paper describes the process of Web log mining and the content of frequent pattern mining. We focus on combining page interest with a frequent pattern mining algorithm, and we propose a dual-constrained multiple-support frequent pattern mining algorithm based on page interest. The main contents are as follows.

(1) We propose the dual-constrained multiple-support frequent pattern mining algorithm (DS_MSA). The paper briefly reviews existing frequent pattern mining algorithms. DS_MSA is proposed to address the rare item dilemma and the rule explosion problem found in current algorithms. The algorithm uses multiple minimum supports, and itemset weights are used to determine the constraints from which the minimum support of each itemset is obtained. In experiments on different datasets, both the quantity and the quality of the mining results improve greatly compared with other algorithms.

(2) We present a new method for calculating page interest. Because pages carry different meaning for a user, we use the degree of page interest to express the importance of each page to that user. The proposed calculation takes into account the user's browsing behavior, the number of times a page appears, the speed of page views, and the number of times a page is linked. It is more rigorous than previous page interest calculation algorithms. A comparison with explicit user data confirms the effectiveness of the algorithm.

(3) The page interest calculation method and the DS_MSA algorithm are combined, and DS_MSA is applied to Web log mining. Page interest is used to characterize the importance of each page and to determine the constraint conditions of the algorithm. We use the dual-constrained multiple-support frequent pattern mining algorithm based on page interest to mine the web log data of the Chongqing agricultural and rural information network. The mining results can then be used to optimize and improve the website, and a specific optimization strategy is also briefly described.

The main innovations of this paper are an improved method for calculating the degree of page interest and the restriction of the minimum support by two constraints. Our mining results are closer to users' interests and alleviate rule explosion and the rare item dilemma to a certain extent. Based on the mining results and the characteristics of agricultural website users, the website can be optimized.
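
To illustrate the multiple-minimum-support idea mentioned in (1), the following is a minimal Python sketch. It is not the DS_MSA algorithm: the abstract does not give the itemset weighting or the two constraints, so the rule used here (an itemset's threshold is the smallest per-page minimum support among its pages) is an assumption taken from generic multiple-minimum-support mining. The function name, page names, and threshold values are all hypothetical.

```python
from itertools import combinations


def mine_frequent_itemsets(sessions, item_min_support, max_size=3):
    """Return {itemset: support} using a per-page minimum support.

    Assumed rule (not the DS_MSA constraint): an itemset is kept when its
    support is at least the smallest minimum support among its pages.
    """
    n = len(sessions)
    session_sets = [set(s) for s in sessions]
    pages = sorted({page for s in session_sets for page in s})
    frequent = {}
    # Brute-force enumeration keeps the sketch short. With per-page
    # thresholds a subset of a frequent itemset need not be frequent,
    # so plain Apriori pruning is deliberately not applied here.
    for size in range(1, max_size + 1):
        for candidate in combinations(pages, size):
            count = sum(1 for s in session_sets if s.issuperset(candidate))
            support = count / n
            threshold = min(item_min_support[p] for p in candidate)
            if support >= threshold:
                frequent[candidate] = support
    return frequent


# Hypothetical user sessions (sets of visited pages) and per-page thresholds.
sessions = [
    ["home", "news", "prices"],
    ["home", "news"],
    ["home", "policy"],
    ["home", "news", "prices"],
    ["home", "policy", "subsidy"],
]
item_min_support = {
    "home": 0.6,
    "news": 0.5,
    "prices": 0.3,
    "policy": 0.3,
    "subsidy": 0.2,  # rarely visited page gets a lower threshold
}

frequent = mine_frequent_itemsets(sessions, item_min_support)
for itemset, support in sorted(frequent.items()):
    print(itemset, round(support, 2))
```

In this toy data the page "subsidy" appears in one of five sessions, so a single global threshold of 0.3 would discard it, while its own lower threshold keeps it. That is the rare item dilemma the abstract refers to, shown here only in its simplest form.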
Keywords/Search Tags: Web log mining, Frequent pattern mining, Page interests, Frequent user access patterns