The stock market is an intricate system in which many factors affect stock returns. How to choose the right stocks for investment scientifically and effectively is a hot topic and one of the important research directions in the field of financial investment. The key to stock classification lies in the selection of stock features and the choice of classification model. Stock classification establishes a mapping between financial indices and stock returns, and then uses this mapping to predict future stock returns. However, because stock prices behave as a nonlinear system, traditional methods of stock investment classification have many shortcomings and must address challenging problems such as limited learning ability and the curse of dimensionality. In machine learning and data mining, the Support Vector Machine (SVM) and the Artificial Neural Network (ANN) are two classical nonlinear classification models that have been widely studied and applied. In practical applications, however, their generalization ability and learning efficiency are often affected by data quality, making it difficult to achieve the expected results. Stock financial indices contain many redundant, irrelevant, and even noisy features, and related research shows that effective feature selection can improve both the efficiency and the prediction accuracy of a classifier. Therefore, to improve the generalization ability and training efficiency of the SVM and the neural network, it is necessary first to study feature selection for stock financial indices and select the indices or factors that significantly influence stock classification. On this basis, the accuracy and efficiency of stock investment classification are studied. In this article, we select historical A-share data for 5672 companies on the Shanghai Stock Exchange from 2013 to 2017, take the stock return rate as the output 
variable, and adopt a variety of feature selection methods to eliminate irrelevant and redundant indicators. Eighteen financial indicators that significantly influence stock investment classification are selected to construct the sample data set; while retaining as much of the original data information as possible, this reduces the dimensionality of the feature space and improves data quality. Then, taking binary classification (positive or negative stock return rate) as an example, models are built with an SVM and a BP (back-propagation) neural network. Both classifiers learn from the data the nonlinear mapping between these important indicators and stock returns, and their generalization ability is used to predict the future returns of the stocks of interest. On this basis, the prediction performance of the two classifiers, SVM and BP neural network, is compared and analyzed for stock investment classification. The horizontal and vertical comparisons show that Relief feature selection works best, and that, with feature selection, the Support Vector Machine model achieves better accuracy than the BP neural network model.
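
To make the Relief-style feature weighting mentioned above more concrete, the following is a minimal sketch in plain Python for a two-class problem (positive or negative return). It is an illustration only and not the implementation used in this study: the names `relief_weights` and `make_toy_data`, the min-max scaling, the Manhattan distance, and the synthetic two-feature data are all assumptions made for the example.

```python
import random


def relief_weights(X, y, n_iterations=None, seed=0):
    """Relief-style feature weights for a two-class problem (illustrative).

    A feature gains weight when it differs between a sampled instance and
    its nearest neighbour of the other class (near miss), and loses weight
    when it differs from the nearest neighbour of the same class (near hit).
    """
    n, d = len(X), len(X[0])

    # Min-max scale every feature to [0, 1] so differences are comparable.
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(d)]
    Z = [[(row[j] - lo[j]) / span[j] for j in range(d)] for row in X]

    def distance(a, b):
        # Manhattan distance on the scaled features (an assumption here).
        return sum(abs(p - q) for p, q in zip(a, b))

    rng = random.Random(seed)
    m = n_iterations or n
    weights = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # cannot compare without both neighbour types
        near_hit = min(hits, key=lambda k: distance(Z[i], Z[k]))
        near_miss = min(misses, key=lambda k: distance(Z[i], Z[k]))
        for j in range(d):
            weights[j] += (abs(Z[i][j] - Z[near_miss][j])
                           - abs(Z[i][j] - Z[near_hit][j])) / m
    return weights


def make_toy_data(n=200, seed=42):
    """Synthetic data: feature 0 tracks the label, feature 1 is pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 1 = positive return, 0 = negative return (toy labels)
        informative = label + rng.uniform(-0.2, 0.2)
        noise = rng.uniform(0.0, 1.0)
        X.append([informative, noise])
        y.append(label)
    return X, y


X, y = make_toy_data()
weights = relief_weights(X, y)
# Rank feature indices from most to least relevant.
ranking = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
print("weights:", weights)
print("ranking (best first):", ranking)
```

In a setting like the one described here, the rows of `X` would hold the candidate financial indicators and `y` the sign of the return; the highest-weighted indicators would then be kept as inputs for the SVM and BP neural network classifiers.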