Local governments across China have set up semiconductor funds one after another to promote the development of the country's semiconductor industry. The original intention of this investment was to accelerate the industry's growth, but in practice it has not had that effect. Blind, repeated investment in the same targets and overheated investment in certain areas have pushed up the valuations of semiconductor companies. Domestic semiconductor investment clearly has problems and pain points. This paper takes a global view of the semiconductor industry to diagnose the problems in China and, from the perspective of the industrial chain, tries to predict the rise and fall of global semiconductor stocks using machine learning, so that the domestic investment problem can be addressed by investing in foreign semiconductor stocks. The question this paper sets out to answer is which stocks to choose, and how to choose them, when part of a fund is invested in the semiconductor industry. The paper begins with an analysis of the semiconductor industry and selects annual data on 52 indicators for all stocks in the global semiconductor industry from 2008 to 2018 for empirical research. After data preprocessing and feature selection, the data from 2008 to 2014 were used as the training set and the data from 2015 to 2018 as the test set. The classification label was whether the stock rose or fell in the following year. An MLP neural network was built to predict the rise and fall of stocks, with the XGBoost and random forest algorithms used for comparison. The models were optimized through parameter tuning and feature selection, and were evaluated and selected on three criteria: classification ability, generalization ability, and algorithm efficiency. Finally, the optimized models were tested with a rolling back test: the data of every three years were used as the training set and the data of the next year as the test set. A stock portfolio was formed from the 30 stocks with the highest predicted probability of rising in the next year in each period. The empirical results show that the growth rate of country risk, operating income growth, the 12-month dividend rate, and the ratio of market value to R&D expenditure have the strongest impact on the next year's stock return. The MLP neural network, XGBoost, and random forest are all useful for forecasting stock price movements, and the MLP neural network performs best of the three. The MLP neural network achieved an average precision of 0.64, an average recall of 0.60, an average F1-score of 0.61, and an AUC of 0.67, and training and testing the whole model took only 0.065 seconds. In the back test, the MLP neural network portfolio had a negative return in 2018 and positive returns in the other six years. It outperformed the global semiconductor benchmark in every year except 2014 and showed clear advantages over the domestic semiconductor benchmark. In terms of risk, its Sharpe ratio was 0.9511 and the distribution of portfolio returns was platykurtic. In terms of practical significance, the selected stocks are safer and have lower P/E ratios than domestic ones, and help investors avoid irrational investments.
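The rolling back test described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`rolling_windows`, `top_n`, `sharpe_ratio`) are hypothetical, the classifier is left out, and the Sharpe ratio formula shown (mean excess return over the sample standard deviation, risk-free rate of zero by default) is an assumption, since the abstract does not say how the ratio was computed. The year range in the example is also an assumption: because the label is the next year's movement, features from 2008 to 2017 would yield seven windows, which happens to match the seven back-test years reported.

```python
from statistics import mean, stdev


def rolling_windows(years, train_len=3):
    """Yield (train_years, test_year): train_len consecutive years, then the next one."""
    years = sorted(years)
    for i in range(len(years) - train_len):
        yield years[i:i + train_len], years[i + train_len]


def top_n(prob_by_stock, n=30):
    """Return the n tickers with the highest predicted probability of rising."""
    ranked = sorted(prob_by_stock.items(), key=lambda kv: kv[1], reverse=True)
    return [ticker for ticker, _ in ranked[:n]]


def sharpe_ratio(period_returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation (assumed formula)."""
    excess = [r - risk_free for r in period_returns]
    return mean(excess) / stdev(excess)


# Assumed feature years 2008..2017; each window trains on three years
# and scores the stocks of the following year.
windows = list(rolling_windows(range(2008, 2018)))
for train_years, test_year in windows:
    print(f"train on {train_years}, select portfolio from {test_year} predictions")
```

In each window, a fitted classifier's predicted probabilities for the test year would be passed to `top_n` to form the 30-stock portfolio, and the resulting yearly portfolio returns would be passed to `sharpe_ratio`.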