K-means Clustering Algorithm And Its Application In The College Library Web Log Mining

Posted on: 2011-09-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Kang
Full Text: PDF
GTID: 2178360305967490
Subject: Systems Engineering
Abstract/Summary:
Nowadays, with the popularization of the network, people's use of the Internet leaves behind a large amount of information that is valuable for analysis. Faced with an ever-growing information base, how to find useful knowledge that is not easily discovered has become an important research topic. This problem can be addressed by mining user access log records with Web log mining techniques.

According to the characteristics of library user access, we mine the Web log of a college library using clustering. The K-means clustering algorithm selects its initial cluster centers at random, which can reduce accuracy. This paper proposes an improved K-means algorithm, IKM, which incorporates a grid method. The algorithm shows a considerable improvement in clustering accuracy and robustness.

For the Web log mining itself, a visual log mining software tool was designed and implemented. This tool can be used to generate the vector table that serves as data input and to tally the results of cluster mining.

Finally, the I-Weka mining system is constructed with the improved K-means clustering algorithm. Using the Java development platform, we add the IKM algorithm to the Weka system. The preprocessed data can then be clustered with the improved I-Weka. Analysis of the final results shows which kinds of books users are interested in, which kinds receive relatively high attention, and which collections are incomplete. This provides a reference for book purchasing by the procurement department of the college library, helping it to use funds reasonably, improve the collection structure, and upgrade the quality of library service.
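The abstract does not spell out how IKM combines K-means with a grid, so the following is only a minimal sketch of one plausible reading: replace the random choice of initial centers with the centroids of the most densely populated grid cells, then run ordinary K-means iterations. The function names (`grid_seeds`, `kmeans`), the `cells_per_dim` parameter, and the density-based selection rule are all assumptions made for illustration and are not taken from the thesis.

```python
import math
from collections import defaultdict


def centroid(members):
    """Component-wise mean of a non-empty list of points."""
    dims = len(members[0])
    return tuple(sum(p[d] for p in members) / len(members) for d in range(dims))


def grid_seeds(points, k, cells_per_dim=4):
    """Assumed seeding rule: centroids of the k most populated grid cells."""
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    # Cell width per dimension; fall back to 1.0 when a dimension is constant.
    widths = [(highs[d] - lows[d]) / cells_per_dim or 1.0 for d in range(dims)]

    cells = defaultdict(list)
    for p in points:
        # Clamp so that points on the upper boundary fall into the last cell.
        key = tuple(
            min(int((p[d] - lows[d]) / widths[d]), cells_per_dim - 1)
            for d in range(dims)
        )
        cells[key].append(p)

    if len(cells) < k:
        raise ValueError("fewer occupied cells than clusters; use a finer grid")

    # Densest cells first; ties broken by cell index so the result is deterministic.
    densest = sorted(cells.items(), key=lambda kv: (-len(kv[1]), kv[0]))[:k]
    return [centroid(members) for _, members in densest]


def kmeans(points, k, cells_per_dim=4, max_iter=100):
    """Standard K-means iterations started from grid-based seeds."""
    centers = grid_seeds(points, k, cells_per_dim)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [
            centroid(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [
        min(range(k), key=lambda i: math.dist(p, centers[i])) for p in points
    ]
    return centers, labels
```

Because the seeds depend only on the data and the grid resolution, repeated runs give the same clustering, which is the kind of robustness the abstract attributes to IKM. A known weakness of this simple rule is that the densest cells can be adjacent; a fuller implementation would probably need to enforce some separation between the chosen cells.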
Keywords/Search Tags: Web log mining, Clustering, K-means algorithm, Library