Study On Web Log Mining Based On Artificial Immune System

Posted on: 2007-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: J Lv
Full Text: PDF
GTID: 2178360185974877
Subject: Computer system architecture
Abstract/Summary:
As a rich source of information, the Web has gradually entered all aspects of people's study, work, and life. As Web structures grow more complex and the volume of information grows larger, it has become increasingly difficult for users to find helpful information on websites that do not take their preferences and browsing interests into account. Web server logs are well-structured datasets that record how users access Web pages. Against this background, Web log mining, which aims to extract users' access patterns from the interactions between users and the Web, has emerged. Clustering methods are often used to mine Web logs and analyze users' access patterns in order to find users with similar access interests and, ultimately, to improve website structure and provide personalized services.

This research applies clustering methods to Web log mining in order to better discover users' access patterns. It focuses mainly on the following aspects:

(1) Research on Web log mining and on artificial immune systems, at home and abroad, is reviewed. The four preprocessing stages of Web logs are analyzed, and the basic mechanisms and principles of artificial immune systems are introduced.

(2) Because hard partitional clustering algorithms do not take the fuzziness or uncertainty of Web logs into account, the fuzzy C-Means clustering algorithm is used in this paper to mine Web logs. With user sessions as rows and the pages users access as columns, a user session matrix is formed whose elements are users' access interest degrees. Fuzzy clustering analysis of this matrix groups users with similar access preferences. After further verification, users' common access requirements and access behavior can be obtained, which provide a foundation for personalized services.

(3) Web logs are high-dimensional. Conventional clustering algorithms, however, suffer from the curse of dimensionality, and they require the number of clusters to be fixed in advance, which is difficult to do in practice. Against this background, this paper applies the principles of artificial immune systems to Web log mining. By analogy with the relationship between antibodies and antigens, the Web server is regarded as the biological body and user access requirements as the intruding antigens. After...
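
The user session matrix and fuzzy C-Means step described in aspect (2) can be illustrated with a minimal sketch. It is not the thesis's implementation. The abstract does not define the "access interest degree", so the sketch assumes, purely for illustration, that it is the share of a session's page views that went to each page. The session data, the helper names `build_session_matrix` and `fuzzy_c_means`, and the parameters `m`, `n_iter`, `tol`, and `seed` are all hypothetical.

```python
import random

# Hypothetical toy sessions: session id -> pages requested in that session.
sessions = {
    "s1": ["/news", "/news", "/sports"],
    "s2": ["/news", "/sports", "/news"],
    "s3": ["/shop", "/cart", "/shop"],
    "s4": ["/cart", "/shop"],
}

def build_session_matrix(sessions):
    """Rows are sessions, columns are pages. Each element is an assumed
    interest degree: the fraction of the session's views on that page."""
    pages = sorted({p for views in sessions.values() for p in views})
    matrix = [[views.count(p) / len(views) for p in pages]
              for views in sessions.values()]
    return pages, matrix

def fuzzy_c_means(rows, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means with fuzzifier m > 1 and Euclidean distance."""
    rng = random.Random(seed)
    n, dim = len(rows), len(rows[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        w = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(w)
        u.append([x / total for x in w])
    centers = []
    power = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        # Cluster centres: means of the rows weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(n)]
            total = sum(weights)
            centers.append([sum(weights[i] * rows[i][k] for i in range(n)) / total
                            for k in range(dim)])
        # Memberships: proportional to distance ** (-2 / (m - 1)).
        new_u = []
        for i in range(n):
            dist = [max(1e-12,  # clamp so a row sitting on a centre does not divide by zero
                        sum((rows[i][k] - centers[j][k]) ** 2
                            for k in range(dim)) ** 0.5)
                    for j in range(n_clusters)]
            inv = [d ** -power for d in dist]
            total = sum(inv)
            new_u.append([x / total for x in inv])
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(n_clusters))
        u = new_u
        if change < tol:  # stop once memberships have stabilised
            break
    return centers, u

pages, matrix = build_session_matrix(sessions)
centers, memberships = fuzzy_c_means(matrix, n_clusters=2)
for sid, row in zip(sessions, memberships):
    print(sid, [round(x, 3) for x in row])
```

Each session receives a degree of membership in every cluster rather than a single hard label, which is the property that motivates fuzzy over hard partitional clustering in aspect (2). The number of clusters still has to be chosen in advance here, which is the limitation aspect (3) raises.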
Keywords/Search Tags: Web Log Mining, Users' Access Pattern, Artificial Immune System, Clustering Analysis