Research Of Database Access Log Based On Weka

Posted on:2013-05-05

Degree:Master

Type:Thesis

Country:China

Candidate:A L Fan

Full Text:PDF

GTID:2268330425982743

Subject:Pattern Recognition and Intelligent Systems

Abstract/Summary:

As society becomes increasingly information-driven, databases have accumulated large amounts of data, and extracting implicit, previously unknown, and potentially useful information from them with data mining technology has become a research focus. This paper describes Weka, an open data mining platform that collects a large number of machine learning algorithms for data mining tasks. It presents a detailed study of data preprocessing and cluster analysis methods for database access logs, and applies them to mining the lending logs of a university library. CV-k-means, a k-means clustering algorithm based on the coefficient of variation, is packaged into the Weka platform. The improved clustering algorithm is then used to analyze the library's lending information and uncover implicit knowledge that can serve as reference data for the library's purchasing department. The main content and contributions of this paper are as follows:

1. The performance of the k-means clustering algorithm depends on the choice of distance metric. Euclidean distance is commonly chosen as the similarity measure in k-means, but it treats all features equally and does not accurately reflect the dissimilarity among samples. To address this problem, this paper proposes a k-means clustering algorithm based on the coefficient of variation (CV-k-means). CV-k-means uses a weight vector of variation coefficients to reduce the effect of irrelevant features. The experimental results show that the proposed algorithm generates better clustering results than the k-means algorithm.

2. As an open data mining platform, Weka collects a large number of machine learning algorithms for data mining tasks. However, the real-world problems to be solved are becoming more complicated, and general-purpose data mining algorithms are showing their limitations. In this paper, the CV-k-means algorithm is run on Weka to obtain a new, personalized data mining platform, in order to ease the tension between general-purpose data mining tools and mining in specialized domains.

3. The lending information of Dalian Polytechnic University is analyzed with the improved, personalized Weka data mining platform. CLC (Chinese Library Classification) categories are used as the objects to be clustered, and three attributes (number of loans, number of renewals, and average lending time) take part in the clustering of the preprocessed data. Analysis of the clustering results yields the degree of reader interest, which is then used to make procurement recommendations for the library.
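
The abstract does not give the exact CV-k-means formulas, so the following is only a minimal sketch of one plausible reading: each feature's coefficient of variation (standard deviation divided by the absolute mean) is normalized into a weight, and that weight scales the feature's squared difference inside the Euclidean distance used by k-means. The function names (`cv_weights`, `weighted_distance`, `cv_kmeans`), the weighting scheme, and the sample records are assumptions made for illustration. They are not the author's implementation and not part of the Weka API.

```python
import math
import random
from statistics import mean, pstdev


def cv_weights(rows):
    """Assumed scheme: weight_j = CV_j / sum(CV), where CV_j = std_j / |mean_j|."""
    cvs = []
    for col in zip(*rows):
        mu = mean(col)
        cvs.append(pstdev(col) / abs(mu) if mu != 0 else 0.0)
    total = sum(cvs)
    if total == 0:
        # All features are constant: fall back to equal weights.
        return [1.0 / len(cvs)] * len(cvs)
    return [cv / total for cv in cvs]


def weighted_distance(a, b, weights):
    """Euclidean distance with each squared difference scaled by its feature weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def cv_kmeans(rows, k, max_iter=100, seed=0):
    """Plain k-means loop that uses the CV-weighted distance for assignment."""
    weights = cv_weights(rows)
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: weighted_distance(row, centers[c], weights))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the previous center if a cluster becomes empty
                centers[c] = [mean(col) for col in zip(*members)]
    return labels, centers, weights


# Hypothetical per-category records: (loans, renewals, average lending days).
rows = [
    (120, 30, 25.0),
    (115, 28, 24.0),
    (10, 2, 26.0),
    (12, 3, 27.0),
]
labels, centers, weights = cv_kmeans(rows, k=2)
print(labels, [round(w, 3) for w in weights])
```

In this toy data the average lending time barely varies across categories, so it receives a much smaller weight than the loan and renewal counts. The two heavily borrowed categories end up in one cluster and the two rarely borrowed ones in the other.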

Keywords/Search Tags:

Data Mining, Cluster analysis, k-means, coefficient of variation, Weka

Related items

1	The Study And Improvement Of Fuzzy C-means Cluster Algorithm
2	Research And Practice Of Network Teaching Data Analysis Based On Weka Platform
3	Improvement And Application Of K-means Algorithm
4	Application Of Data Mining Techniques In Acupuncture In Prescription Compatibility
5	Research And Application Of Data Mining In Insurance Customer Data
6	The Research And Application Of Improved K-Means Algorithm In Data Mining
7	The Research Of K-means Clustering Algorithm In Data Mining
8	Application Of Data Mining Tools In Disease Risk Factors And Treatment Methods
9	Research And Implementation Of Data Mining Based Education Analysis System
10	Research And Application Of Campus Card Consumption Based On Data Mining