Web Usage Mining Based On Granular Computing

Posted on:2011-10-23

Degree:Doctor

Type:Dissertation

Country:China

Candidate:J Zhao

Full Text:PDF

GTID:1118360308963883

Subject:Computer application technology

Abstract/Summary:

The amount of information on the Web is growing at a remarkable rate, and applications need to abstract, filter, and discover useful information in these data. Web usage mining applies data mining techniques to Web usage data in order to discover the meaningful patterns hidden in it. It has important theoretical and practical value for providing personalized services, improving Web server performance and site design, supporting business decisions, and so on.

Applying intelligent technologies to Web usage mining for Electronic Commerce (EC), this dissertation aims to design mining models and algorithms under a uniform theoretical framework. Through the collection, management, and analysis of large volumes of usage data, hidden patterns and rules are found that can be used to provide decision support, improve EC Web site performance, and enhance business safety, which can bring considerable benefit to enterprises.

Based on granular computing and the other theories under its uniform framework, such as rough sets and fuzzy sets, this dissertation focuses on several key techniques and new application areas of Web usage mining. The contributions of the dissertation are as follows:

1. A new method of multi-granular user behavior data collection is proposed. The method uses a configurable plug-in embedded in Web servers to collect user behavior data, which can be combined with data on EC-specific events and simplifies the subsequent pre-processing. It addresses the unreliability of Web logs, their single data type, and their inability to integrate with other EC event data. The experimental results show that the proposed method collects reliable data at low cost and provides high-quality data sources for Web usage mining.

2. Several methods are proposed to improve the pre-processing model. A new hybrid method that combines an online method with Web log supplementation is proposed for building the Web site topology, so that the topology is recovered as completely as possible. A new Just Recently Used (JRU) algorithm is proposed that uses new heuristics to complete missing pages, which reduces the search space and yields more reasonable and reliable results.

3. An efficient and complete attribute reduction algorithm based on knowledge granules is proposed for the high-dimensional data in Web usage mining. The source of the inefficiency of existing attribute reduction algorithms is studied, and, based on the theory of granular computing, basic algorithms for computing the indiscernibility relation and the positive region are designed. On this basis a complete and efficient attribute reduction algorithm is proposed. In these algorithms dynamic SQL is used to obtain the sorted object sets directly, so the sorting algorithm and the incremental positive region algorithm can be omitted. Five new heuristic strategies are designed for attribute selection; they avoid selecting useless attributes, reduce the search space, and simplify intermediate results, which ensures the completeness and efficiency of the algorithms. Theoretical analysis and experimental results show that the proposed reduction algorithm is more efficient than existing ones and better suited to very large databases.

4. A knowledge-granule-based clustering framework for sparse data with high attribute dimensionality is proposed. Based on this framework, two clustering algorithms, one for continuous data and one for discrete data, are designed to analyze user characteristics. Using a dimensional threshold vector, the equivalent granules in each dimension are found by a leaping search, and the data need not be converted to binary variables. From these, the initial equivalence relations are obtained. A variable precision quadratic clustering model is then designed to refine the result, which gives the algorithm resistance to noise. A new clustering quality evaluation model is defined for the application field. The experimental results show that the algorithms provide results at various granularities with high accuracy and reflect the characteristics of the data.

5. A behavior trust forecast and control model is proposed based on a Bayesian network and behavior log mining. Current methods for evaluating Web user behavior are costly and lack feasibility. To solve this problem, multiple kinds of data are extracted from user behavior logs as trust attributes. The Bayesian network is then built and the trust forecast and control algorithm is designed. An improved semi-fuzzy clustering method is used to set and adjust the model parameters, so that a correspondence is established between quantitative evidence and trust grade. The model can predict the trust grade under multiple trust attributes. Results on practical data show that several aspects of server performance are improved and that users' trading behaviors are constrained.
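The indiscernibility relation and positive region that the attribute reduction contribution relies on are standard rough set notions, and a small sketch may make them concrete. The Python below is illustrative only: it is a textbook-style greedy reduction, not the dissertation's algorithm (which uses dynamic SQL and five heuristic strategies not described here), and a greedy search of this kind does not guarantee a minimal reduct. The function names and the toy session table are assumptions made for the example.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Equivalence classes (lists of row indices) of the
    indiscernibility relation induced by the attributes in attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def positive_region(rows, cond_attrs, decision):
    """Indices of rows whose equivalence class is consistent,
    i.e. every row in the class has the same decision value."""
    pos = set()
    for block in partition(rows, cond_attrs):
        if len({rows[i][decision] for i in block}) == 1:
            pos.update(block)
    return pos

def greedy_reduct(rows, cond_attrs, decision):
    """Add attributes one at a time, always picking the one that
    enlarges the positive region most, until the positive region
    is as large as with all condition attributes."""
    target = len(positive_region(rows, cond_attrs, decision))
    chosen, remaining = [], list(cond_attrs)
    while len(positive_region(rows, chosen, decision)) < target:
        best = max(
            remaining,
            key=lambda a: len(positive_region(rows, chosen + [a], decision)),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical toy decision table: one row per user session.
rows = [
    {"pages": "many", "cart": "yes", "referrer": "search", "buyer": "yes"},
    {"pages": "many", "cart": "no",  "referrer": "search", "buyer": "no"},
    {"pages": "few",  "cart": "yes", "referrer": "ad",     "buyer": "yes"},
    {"pages": "few",  "cart": "no",  "referrer": "ad",     "buyer": "no"},
]

reduct = greedy_reduct(rows, ["pages", "cart", "referrer"], "buyer")
print(reduct)
```

In this toy table the `cart` attribute alone separates buyers from non-buyers, so the other two condition attributes are redundant for the decision.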

Keywords/Search Tags:

Web usage mining, Granular computing, Attribute Reduction, Web user clustering, Behavior trust management

Related items

1	Granular Computing Reduction Method Of SDG Non-accommodating Fault Decision Table And Its Application
2	Research And Application Of Granular Computing Based On Rough Sets In Data Mining
3	The Research Of Rough Set Theory And Granular Computing Crossing Problems
4	Attribute Reduction Based On Granular Computing Algorithm And Applied Research
5	Study On Granularity Clustering
6	Analysis And Proof About Attribute Reduction Of Discernibility Function Based On Granular Computing
7	Research On Mobile User Behavior Pattern Mining And Online Identification Strategy Based On User Trust
8	Study Of Data Mining Based On Rough Set And Granular Computing
9	The Research Of Attribute Reduction Algorithm Based On Rough Set Theory
10	The Application Of Granular Computing In Clustering Analysis