
High-dimensional Data Clustering Algorithm And Its Application In Health Management

Posted on: 2021-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: T Yao
GTID: 2428330647450582
Subject: Management Science and Engineering
Abstract/Summary:
With the rapid development of information technology and computer science, data collection, transmission, and storage have become more convenient, and data platforms have been built out rapidly as a result. New forms of data keep emerging, and data dimensionality has grown from a handful of dimensions to thousands. This growth in dimensionality poses major challenges for data analysis and processing. Clustering, an important technique in data analysis, aims to partition a data set into groups according to its characteristics so that the individuals in each group can be handled accurately. The technique has important applications in health management: relevant data are clustered so that individuals in each group can receive more personalized and accurate health management. However, many factors are related to health, and traditional clustering algorithms such as K-Means become inefficient or even completely ineffective on such high-dimensional data. This thesis studies clustering algorithms for high-dimensional data and their applications. (1) A K-Means clustering model and algorithm based on SCAD function regularization (the Reg. K-Means algorithm) is proposed. The algorithm can effectively remove redundant dimensional information while retaining the truly valuable information, making the cluster structure more accurate and stable. (2) Simulations are used to verify the computational efficiency and effectiveness of the clustering algorithm, and on this basis an empirical analysis is performed using real health management data. Specifically, we selected health census data of middle-aged and elderly people as the original data source and applied the algorithms to classification for disease prediction. We compare and analyze the clustering results of the K-Means, Sparse K-Means, and Reg. K-Means algorithms, which demonstrates the efficiency of the proposed algorithm and shows how the clustering results can be used for accurate health management.
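For readers unfamiliar with the SCAD function named in the abstract, the following is a minimal sketch of the standard SCAD penalty of Fan and Li for a single coefficient. The abstract does not say how the penalty enters the K-Means objective (for example, whether it is applied to feature weights or to cluster centers), so only the penalty itself is shown. The function name `scad_penalty`, the parameter names, and the default `a = 3.7` (the value commonly used in the literature) are assumptions for illustration, not details taken from the thesis.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty for one coefficient (illustrative sketch).

    theta: coefficient value being penalized
    lam:   regularization strength (lambda > 0)
    a:     shape parameter (a > 2); 3.7 is a commonly used default
    """
    if lam <= 0 or a <= 2:
        raise ValueError("SCAD requires lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Small coefficients: same linear penalty as the L1 (lasso) penalty.
        return lam * t
    if t <= a * lam:
        # Middle range: quadratic piece that gradually flattens the penalty.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    for value in (0.0, 0.5, 1.0, 2.0, 3.7, 10.0):
        print(value, scad_penalty(value, lam=1.0))
```

Because the penalty is linear near zero and constant for large values, small (likely redundant) quantities are pushed toward zero while large ones are left essentially unpenalized, which is consistent with the abstract's description of removing redundant dimensions while keeping the valuable ones.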
Keywords/Search Tags: SCAD, high-dimensional data, health management, dimension penalty