Multi-improved Methods Of Density Clustering And Their Application In Stock Valuation

Posted on: 2023-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: H H Wang
Full Text: PDF
GTID: 2568306839463954
Subject: Management Science and Engineering
Abstract/Summary:
With the development of science and technology, human society has gradually moved from the industrial era to the information era, and mining the hidden value in information has become a focus of academic research. As an important data mining tool, clustering can extract valuable information from massive data, and it is now an active research topic in artificial intelligence. Among unsupervised learning algorithms, the density peaks clustering (DPC) algorithm and the density-based spatial clustering of applications with noise (DBSCAN) algorithm occupy an important position. However, both algorithms still have limitations in clustering performance:

(1) The DPC algorithm does not accurately identify anomalies and always assigns them to clusters.
(2) The DPC algorithm performs poorly on non-convex datasets.
(3) The DPC algorithm requires manual selection of cluster centers, which harms the objectivity and accuracy of the algorithm.
(4) The DBSCAN algorithm requires manual selection of the distance threshold Eps, which reduces the accuracy of the clustering results.

To address these problems, this thesis proposes corresponding improvements and applies the improved algorithms to stock valuation:

(1) To address the inability of the DPC algorithm to identify anomalies accurately and its poor clustering performance on non-convex datasets, a novel Gini index-based density peaks clustering (GIDPC) algorithm is proposed. The algorithm uses the idea of the Gini index to redefine how the local density of data points is calculated, which yields a better decision graph for identifying cluster centers. It also sets a reasonable threshold for anomaly identification and defines how anomalies are assigned, so that datasets containing a variety of data distribution patterns are handled better. However, the algorithm inherits the drawback that the DPC algorithm requires manual selection of cluster centers.
(2) To address the need to select the distance threshold Eps manually in the DBSCAN algorithm, a novel adaptive DBSCAN algorithm based on bird swarm optimization (BSA-DBSCAN) is proposed. The algorithm uses the global search ability of the bird swarm optimization algorithm to scan the parameter space adaptively and outputs the optimal distance threshold Eps, making parameter selection self-adaptive.
(3) To address the need to select cluster centers manually in the GIDPC algorithm, the two-stage integrated automatic clustering (TSIAC) algorithm is proposed. The algorithm relies on the ability of the BSA-DBSCAN algorithm to obtain the true number of clusters in the dataset automatically when appropriate parameter values are entered, which gives the GIDPC algorithm a basis for selecting cluster centers. An automatic cluster center selection mechanism is also introduced, so that the clustering results are output without manual selection of cluster centers and the clustering becomes fully automatic.
(4) Finally, the application area of the improved algorithms is extended by applying them to the valuation of stock growth capacity. The experimental results show that the improved algorithms cluster stocks more accurately according to their characteristics, which helps in evaluating the condition of each stock, provides a valuable reference for stock investors, and shows some potential for practical application.
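For readers unfamiliar with the decision graph mentioned above, the following is a minimal sketch of the two quantities used by the original DPC algorithm with a cutoff kernel: the local density rho and the distance delta to the nearest point of higher density. It is not the GIDPC density proposed in the thesis, whose Gini index-based definition is not given in this abstract. The function name `dpc_decision_values`, the cutoff `d_c`, and the toy points are assumptions made for illustration only.

```python
from math import dist


def dpc_decision_values(points, d_c):
    """Return (rho, delta) for the original DPC cutoff-kernel formulation.

    points: list of coordinate tuples (illustrative input format)
    d_c:    cutoff distance, a hand-picked assumption in this sketch
    """
    n = len(points)
    # Pairwise Euclidean distances.
    d = [[dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        # Distances to all points that are strictly denser than point i.
        higher = [d[i][j] for j in range(n) if rho[j] > rho[i]]
        # By convention, a point of maximal density gets its largest distance.
        delta.append(min(higher) if higher else max(d[i]))
    return rho, delta


# Toy data (assumed): two small groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.05, 5.05),
          (10.0, 0.0)]
rho, delta = dpc_decision_values(points, d_c=0.12)
```

Plotting delta against rho gives the decision graph. Points with both high rho and high delta are candidate cluster centers, which in the original DPC algorithm are picked by hand. A point with low rho but high delta, such as the isolated point in the toy data, is the kind of point an anomaly threshold would target.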
Keywords/Search Tags:Density peaks clustering, Density-based spatial clustering of applications with noise, Anomaly identification, Distance threshold Eps, Stock growth capacity valuation