
Clustering Analysis Of Mixed Data With Periodic Nominal Variables

Posted on: 2022-08-26; Degree: Master; Type: Thesis
Country: China; Candidate: W Yan
GTID: 2518306335984359; Subject: Statistics
Abstract/Summary:
In today's information age, information that contains only a single type of data can no longer meet the practical needs of many industries, and management and operational efficiency suffer as a result. As 5G technology gradually enters people's daily lives, a digital and intelligent way of life will become the new normal; the volume of data handled in every field will keep growing, and data structures will become more complex. For example, fields such as finance and economics, biomedicine, and mobile communications produce large amounts of mixed data, in which some observations come from numeric variables and others from nominal (attribute) variables. Besides exploring the familiar numeric data, mining the information contained in attribute data has also become vital: it is a matter of who can more fully understand customers' information and real needs. Cluster analysis of mixed data has therefore become a popular research topic.

In complex mixed data, the observations of some nominal variables are also periodic. For example, the periodic nominal variable "season" takes the values "spring", "summer", "autumn", and "winter". Although the classification of mixed data is an application area of great practical value, similarity characterization and cluster analysis methods for mixed data with periodic nominal variables are not yet mature.

This thesis first describes the shortcomings of existing ways of characterizing similarity between samples in terms of the distance between observations of periodic nominal variables in mixed data. To remedy these shortcomings, a new method is proposed for quantifying the observations of periodic nominal variables, and a formula for the distance between observations of a periodic nominal variable is given. Combining this with the distance formula for numeric variable observations, a new distance formula for observations of mixed data with periodic nominal variables is proposed to measure the similarity between samples. Finally, experiments on real data sets show that the proposed quantification method and the new distance formula characterize similarity between observations more reasonably, so that clustering of mixed data achieves higher accuracy. The proposed method makes up for the shortcomings of earlier approaches to classifying this kind of data and has some academic value and practical significance.
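The abstract does not state the thesis's actual formulas, so the following Python sketch is only one plausible illustration of the general idea, not the author's method. It assumes that the levels of a periodic nominal variable are placed in a fixed cyclic order, that the distance between two levels is the shortest number of steps around the cycle scaled to [0, 1], and that this term is combined with a squared Euclidean term over numeric variables that have already been scaled. The names `cyclic_distance`, `mixed_distance`, and `weight` are hypothetical.

```python
import math

# Assumed cyclic order of the periodic nominal variable "season".
SEASONS = ["spring", "summer", "autumn", "winter"]


def cyclic_distance(a, b, levels):
    """Shortest step count between two levels around the cycle, scaled to [0, 1].

    Assumes `levels` lists at least two categories in their cyclic order.
    """
    k = len(levels)
    gap = abs(levels.index(a) - levels.index(b))
    steps = min(gap, k - gap)   # walk the shorter way around the cycle
    return steps / (k // 2)     # k // 2 is the largest possible step count


def mixed_distance(x_num, x_cat, y_num, y_cat, levels, weight=1.0):
    """Illustrative mixed distance: Euclidean numeric part plus a cyclic part.

    `x_num` and `y_num` are numeric vectors assumed to be pre-scaled;
    `weight` (hypothetical) balances the nominal term against the numeric one.
    """
    num_sq = sum((p - q) ** 2 for p, q in zip(x_num, y_num))
    cat = cyclic_distance(x_cat, y_cat, levels)
    return math.sqrt(num_sq + weight * cat ** 2)
```

Under these assumptions "winter" is as close to "spring" as "summer" is, while "autumn" is the farthest level from "spring". A plain matching distance, by contrast, would treat every pair of different seasons as equally dissimilar.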
Keywords/Search Tags: Distance, Attribute variable, Mixed data, Periodic nominal variable