Empirical Analysis Of K-means Algorithm For Internet Industry Stocks

Posted on:2021-10-06

Degree:Master

Type:Thesis

Country:China

Candidate:X L Shi

Full Text:PDF

GTID:2480306107980019

Subject:Master of Applied Statistics

Abstract/Summary:

High returns come with high risk, so choosing which Internet industry stocks to invest in is an important question for investors. The ultimate goal of this thesis is to cluster Internet industry stocks effectively and so provide a basis for investors' decisions. To that end, the thesis collects data on the main factors influencing Internet industry stocks and applies methods and principles similar to those Zhang Wenming used to improve the k-means clustering algorithm for text clustering, adapting them to the analysis of Internet industry stock data, to stock selection, and to the resulting conclusions, with the aim of helping investors choose the most suitable Internet industry stocks.

The k-means algorithm is a typical distance-based clustering algorithm. Its advantages are simple operation, wide applicability, and high compressibility and scalability when processing large data sets. It also has limitations: because the k-means algorithm selects its initial cluster centers at random, it is highly unstable and liable to reach only a local optimum. This thesis analyzes in depth a k-means clustering algorithm proposed by Zhang Wenming that combines density with nearest neighbors. The improved algorithm increases stability and efficiency and reduces computational overhead. Its basic idea is to first analyze the density of the data set and remove noise and outliers according to a manually specified threshold, and then to use nearest neighbors to determine the initial cluster centers of the improved algorithm, starting from the initial centers calculated from the density. The traditional and improved algorithms were both tested on the iris data set, and their clustering quality was evaluated with two indicators, the sum of squared errors (SSE) and the silhouette coefficient (Sil). The results show that the improved (comprehensive) algorithm clusters much better than the traditional k-means algorithm.

After the data were cleaned, standardized, and checked with the KMO test, the first five common factors were extracted through factor analysis and named market, potential, profit, evaluation, and risk factors. Using the comprehensive algorithm, Internet industry stocks can be divided into four categories: first-class stocks with certain investment value, second-class stocks without investment value, third-class stocks with low investment value, and a fourth category of investment value.
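The improvement idea summarized in the abstract (density analysis, threshold-based removal of noise, nearest-neighbor refinement of the initial centers, then evaluation by SSE and silhouette) can be illustrated with a minimal sketch. This is not Zhang Wenming's exact algorithm, whose details the abstract does not give. The function names, the parameters `radius`, `min_neighbors`, and `m`, and the rule for spreading seeds (density multiplied by distance to the nearest chosen seed) are all assumptions made for illustration, and a small hand-made data set stands in for the iris data.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def density_nn_init(points, k, radius, min_neighbors, m=3):
    """Hypothetical density + nearest-neighbor initialization."""
    # Density of a point: number of other points within `radius`.
    dens = [sum(1 for j, q in enumerate(points) if j != i and dist(p, q) <= radius)
            for i, p in enumerate(points)]
    # Manual threshold: low-density points are treated as noise or outliers.
    core = [(d, p) for d, p in zip(dens, points) if d >= min_neighbors]
    seeds = [max(core, key=lambda t: t[0])[1]]  # densest point first
    while len(seeds) < k:
        # Assumed rule: prefer points that are dense AND far from chosen seeds.
        best = max(core, key=lambda t: t[0] * min(dist(t[1], s) for s in seeds))
        seeds.append(best[1])
    core_pts = [p for _, p in core]
    # Nearest-neighbor step: average each seed with its m nearest core points.
    centers = [mean(sorted(core_pts, key=lambda p: dist(p, s))[:m + 1]) for s in seeds]
    return core_pts, centers

def assign(points, centers):
    return [min(range(len(centers)), key=lambda i: dist(p, centers[i])) for p in points]

def kmeans(points, centers, max_iter=100):
    """Standard Lloyd iterations starting from the given centers."""
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for i, c in enumerate(centers):
            members = [p for p, l in zip(points, labels) if l == i]
            new_centers.append(mean(members) if members else c)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

def sse(points, centers, labels):
    # Sum of squared distances from each point to its assigned center.
    return sum(dist(p, centers[l]) ** 2 for p, l in zip(points, labels))

def silhouette(points, labels):
    # Mean silhouette coefficient; singleton clusters score 0 by convention.
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Toy data (not the iris set): three tight groups plus one isolated outlier.
blobs = [(x + dx, y + dy) for x, y in [(0, 0), (10, 10), (0, 10)]
         for dx, dy in [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]]
data = blobs + [(20, 0)]

core_pts, init_centers = density_nn_init(data, k=3, radius=1.5, min_neighbors=2)
centers, labels = kmeans(core_pts, init_centers)
print("points kept:", len(core_pts))
print("centers:", sorted(centers))
print("SSE:", sse(core_pts, centers, labels))
print("silhouette:", silhouette(core_pts, labels))
```

Because the initial centers are computed from the data rather than drawn at random, repeated runs give the same clustering, which is the stability property the abstract attributes to the improved algorithm. A lower SSE and a silhouette value closer to 1 both indicate tighter, better-separated clusters.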

Keywords/Search Tags:

Internet industry, stocks, k-means algorithm, empirical analysis

Related items

1	The Empirical Research Of Fama-French Five-factor Model In The Valuation Of Internet Listed Enterprises
2	The Research On Dependence Of Internet Finance Concept Stocks Based On Time-varying Copula Model With Wavelet Analysis
3	Linkage Analysis Of Individual Stocks In Mask Industry Under COVID-19
4	The Study And Application Of Some Issues For Cluster Analysis
5	Time Series Analysis Of Industry Index Stocks
6	Research And Application Of Intuitionistic Fuzzy C-Means Clustering Algorithm
7	Applied Research In China's Securities Industry By The Diagnostic Technology About Outliers
8	A Comparative Study Of Internet Enterprise Value Evaluation Based On DDM And EVA Model
9	Missing Data Filling Method And Empirical Analysis
10	Empirical Study Of Tianjin Logistics Industry Development And Economic Growth