Research And Application Of A Semi-supervised Clustering To Student Dormitory Assignment And Improvement Based On Bayesian Statistics

Posted on: 2016-03-02  Degree: Master  Type: Thesis
Country: China  Candidate: Y Yang
GTID: 2417330542492397  Subject: Applied Statistics
Abstract/Summary:
Semi-supervised clustering analysis is a new research area in data mining and machine learning. In general, clustering methods operate on unlabeled sample data and cannot use prior knowledge or satisfy constraints. However, constraints and usable prior knowledge are common in many clustering problems. For example, in the student dormitory assignment problem, an assignment must respect the maximum capacity of each dormitory, and prior knowledge can supply labels for part of the sample data. In this situation, neither traditional clustering nor traditional classification methods meet the requirements of the problem. In this paper, a semi-supervised k-means clustering algorithm that limits the number of objects in each cluster is proposed for solving the dormitory assignment problem, and it is implemented in MATLAB. Then, by analyzing the stability and convergence of the clustering results, we obtain the relationship between the expected accuracy of the clustering method and the statistical model of the data. Finally, based on the theory of Bayesian statistical analysis, we introduce prior knowledge or experience about the distribution function by statistical methods, and determine the posterior according to the prior model. In an experiment on a three-dimensional Gaussian distribution data set, the accuracy of the semi-supervised k-means clustering based on Bayesian statistics was 93.67%, which is higher than that of k-means (91.67%).
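The abstract does not give the algorithm's details, so the following is only a minimal Python sketch of one way a capacity-limited, semi-supervised k-means could look. It is not the thesis's MATLAB implementation. The function name constrained_kmeans, the greedy nearest-first assignment step, the use of -1 to mark unlabeled points, and all parameter names are assumptions made for illustration, and the Bayesian prior described in the abstract is not modeled here.

```python
import math
import random


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def constrained_kmeans(X, k, capacity, labels=None, n_iter=50, seed=0):
    """Hypothetical capacity-limited k-means; labels[i] == -1 means unknown."""
    n = len(X)
    labels = list(labels) if labels is not None else [-1] * n
    if n > k * capacity:
        raise ValueError("total capacity is smaller than the number of points")
    fixed_counts = [labels.count(c) for c in range(k)]
    if max(fixed_counts) > capacity:
        raise ValueError("known labels alone exceed a cluster's capacity")

    # Start from random points, then prefer the mean of known members.
    rng = random.Random(seed)
    centers = [tuple(X[i]) for i in rng.sample(range(n), k)]
    for c in range(k):
        members = [X[i] for i in range(n) if labels[i] == c]
        if members:
            centers[c] = mean(members)

    free = [i for i in range(n) if labels[i] < 0]
    assign = None
    for _ in range(n_iter):
        new_assign = list(labels)      # labeled points never move
        counts = list(fixed_counts)    # seats already taken by labeled points
        # Greedy step (an assumption): visit point/cluster pairs nearest first
        # and place a point in the first cluster that still has room.
        pairs = sorted(
            (math.dist(X[i], centers[c]), i, c) for i in free for c in range(k)
        )
        for _dist, i, c in pairs:
            if new_assign[i] < 0 and counts[c] < capacity:
                new_assign[i] = c
                counts[c] += 1
        if new_assign == assign:       # assignments stopped changing
            break
        assign = new_assign
        # Standard k-means update: move each center to the mean of its members.
        for c in range(k):
            members = [X[i] for i in range(n) if assign[i] == c]
            if members:
                centers[c] = mean(members)
    return assign, centers
```

In the dormitory setting, X would hold one feature vector per student, k the number of rooms, and capacity the number of beds per room. Calling constrained_kmeans(X, k=2, capacity=3, labels=[0, -1, -1, -1, 1, -1]) on six students keeps the two pre-assigned students in their rooms and places at most three students in each room. The greedy step is a simple heuristic and does not guarantee a minimum-cost assignment.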
Keywords/Search Tags: semi-supervised clustering, student dormitory assignment, k-means clustering, Bayesian statistical analysis