Research On Feature Dimension Reduction Algorithm And Its Application In Stock Price Forecast

Posted on: 2021-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: X R Xie
GTID: 2370330611966799
Subject: Computational Mathematics
Abstract/Summary:
With the advent of the information age, data acquisition has become more convenient, and data are growing explosively in both dimensionality and sample count. Industries of all kinds use the speed and convenience of the Internet to continuously absorb, acquire, and exchange data. Such data help people describe and understand things in detail from different angles and in different ways. However, problems such as redundant data and computational difficulty can easily lead to inaccurate descriptions of the underlying information. Although a large volume of high-dimensional sample data carries more and richer information, how to capture the key content and how to handle and discard redundant information remains a problem that needs extensive research. One established approach is to reduce the dimension of high-dimensional data.

Many linear and nonlinear dimensionality reduction methods exist. Among the most widely used is principal component analysis (PCA), whose advantages are that it imposes no specific restrictions and is simple and clear. However, PCA is an unsupervised feature extraction algorithm, so it cannot fully take into account the prior information carried by the label. In addition, the key step of choosing the number of principal components lacks objectivity: retaining too many or too few components can easily reduce model accuracy, and there are few previous studies on this. In response to these problems, the main research work of this thesis is as follows:

(1) Many studies do not consider the correlation between features and labels, that is, the prior information of the label, before applying PCA. This thesis proposes using mutual information (MI) to measure the importance of each feature to the label before PCA dimensionality reduction. Features are divided into weak, medium, and strong groups according to their mutual information values, the weak group is filtered out, and PCA is then applied to the remaining features.

(2) Because the cumulative contribution rate method for selecting the number of principal components is too subjective, the improved PCA algorithm (IPCA) proposed in this thesis uses the average multiple correlation coefficient to measure the correlation with the original data as the number of principal components increases, and uses it together with the cumulative contribution rate to determine the number of principal components.

(3) Actual stock and index data covering a longer time range, with a total of 17 factors that affect the stock price, are used to analyze the dimensionality reduction methods before and after the above improvements. The mean square error corresponding to the number of principal components is obtained in order to compare PCA before and after the improvement. The prediction results after MI-IPCA double dimensionality reduction are then compared with those after IPCA dimensionality reduction alone to judge the effectiveness of introducing mutual information.
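The mutual-information filtering step described in item (1) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not say how mutual information is estimated or where the weak/medium/strong boundaries lie, so the equal-width histogram estimator, the split into thirds by rank, and all function and feature names here (`mutual_information`, `split_by_importance`, `trend`, `regime`, `noise`) are hypothetical.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each value to an equal-width bin index in 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=4):
    """Histogram estimate of I(X;Y) in nats (assumed estimator)."""
    bx, by = equal_width_bins(x, n_bins), equal_width_bins(y, n_bins)
    n = len(x)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    # Sum p(i,j) * log( p(i,j) / (p(i) p(j)) ) over the occupied cells.
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


def split_by_importance(features, label, n_bins=4):
    """Score each feature by MI with the label, then split the ranking
    into weak / medium / strong thirds (the thirds are an assumption)."""
    scores = {name: mutual_information(col, label, n_bins)
              for name, col in features.items()}
    ranked = sorted(scores, key=scores.get)  # ascending MI
    third = len(ranked) // 3
    groups = {
        "weak": ranked[:third],
        "medium": ranked[third:len(ranked) - third],
        "strong": ranked[len(ranked) - third:],
    }
    return scores, groups


# Toy data (made up for illustration, not stock data).
label = [float(t) for t in range(12)]
features = {
    "trend": [2.0 * v + 1.0 for v in label],    # fully determined by the label
    "regime": [0.0] * 6 + [1.0] * 6,            # coarse function of the label
    "noise": [float(t % 3) for t in range(12)], # unrelated to the label bins
}
scores, groups = split_by_importance(features, label)

# Features outside the weak group would be passed on to PCA.
kept = groups["medium"] + groups["strong"]
```

With real data the histogram estimate is sensitive to the bin count and sample size, so `n_bins` and the group boundaries would need to be chosen with care.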
Keywords/Search Tags:mutual information, improved principal component analysis, double dimensionality reduction, neural network prediction