Research On ST Risk Prediction Of Listed Companies Based On SVC Stacking Ensemble Model

Posted on:2023-03-21

Degree:Master

Type:Thesis

Country:China

Candidate:Q Q Xu

GTID:2530306806969669

Subject:Applied Statistics

Abstract/Summary:

Special Treatment (ST) is a system unique to China that is applied to listed companies with abnormal financial conditions or other abnormalities. In theory, Special Treatment warns investors that a stock carries high risk and that continued investment may lead to heavy losses. In practice, however, ST shares have become a hot sector amid speculation on asset restructuring and shell resources. Although a few speculators profit from this, the vast majority of investors suffer heavy losses. To strengthen the protection of investors' rights and crack down on these illegal securities activities, the CSRC launched a reform of the delisting rules. With the reform of the registration system, a regular delisting mechanism has been established and the formation of a survival-of-the-fittest mechanism in the capital market has accelerated. Investors therefore pay close attention to the Special Treatment of listed companies. The deterioration of a listed company's financial situation is usually a gradual process. If ST risk can be predicted from available information, investors can invest less blindly and companies can adjust their business decisions to avoid ST. From the perspective of protecting investors' rights and interests, this thesis proposes an ensemble model to predict the ST risk faced by listed companies.

Commonly used ST risk prediction models include Discriminant Analysis, Logistic Regression, Machine Learning and Neural Networks. Discriminant Analysis is easy to compute but rests on strict assumptions. Logistic Regression estimates its parameters by maximum likelihood, which places demands on the sample size. Machine Learning methods are now widely and successfully used in autonomous driving, crime prediction, credit card fraud detection and other fields. This thesis therefore uses Machine Learning to predict ST risk. There are three common ensemble learning algorithms: Bagging, Boosting and Stacking. Based on Stacking multi-model fusion, this thesis combines Support Vector Machine, Logistic Regression, Decision Tree and Naive Bayes to improve the performance of ST risk prediction.

In sample selection, domestic scholars usually treat ST enterprises as enterprises in financial distress. This thesis holds that ST is not equivalent to Financial Distress, because Financial Distress is measured in terms of cash flow while Special Treatment is assessed in terms of profitability. The sample therefore consists of 150 ST enterprises and 450 normal enterprises among A-share listed companies from 2018 to 2021. A total of 49 predictors were selected from financial and non-financial factors, and the operating data of the sample companies for the preceding six years were collected from CSMAR. Descriptive statistics revealed outliers in the sample data. After data normalization, a filter is used to eliminate features with zero variance, since features with small variance cannot be used to distinguish samples. Principal components are then extracted, which reduces the influence of collinearity on the model while limiting the loss of information contained in the original indicators.

Based on the principles of the SVC algorithm, this thesis trains the ST risk model and uses grid search and cross-validation to obtain the optimal ST risk prediction model. The classification performance of each model is evaluated mainly by accuracy, precision, recall, F1-score and AUC. The experimental results show that the sample data is nonlinear and unbalanced. With a sigmoid kernel and a class_weight of 1:3, the SVC model achieves high accuracy (88.89%) and recall (90.48%). Comparing SVC models trained on data with different lead times, prediction performance is better on the t-1, t-2 and two-year cumulative data sets.

Following the idea of Stacking multi-model fusion, Logistic Regression, Decision Tree and Naive Bayes models are used as the first-layer base learners of the Stacking algorithm, and the SVC-based ST risk prediction model is used as the second-layer learner to build the ensemble model. In 20 consecutive experiments under the optimal hyperparameter combination and three-fold cross-validation, the accuracy of the ensemble model averaged 89.75% and the recall averaged 90.25%, which is about 7.53% and 2.31% higher, respectively, than the single SVC model, and the fluctuations of the evaluation indicators of the ensemble model were the smallest. The ensemble model can thus capture more than 90% of the ST enterprises, and its prediction performance and stability are better than those of a single model. This thesis therefore concludes that the ensemble model has high practical value for ST risk prediction of listed companies.
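The pipeline described in the abstract (normalization, zero-variance filtering, principal component extraction, then a Stacking ensemble with an SVC as the second-layer learner, tuned by grid search with three-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the thesis's actual code: the data is synthetic (600 samples, 49 features, about 25% positive class, standing in for the CSMAR sample), and the choice of `MinMaxScaler`, the 95% variance cut-off for PCA, the tree depth and the hyperparameter grid are all assumptions, since the abstract does not state them.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import StackingClassifier
from sklearn.feature_selection import VarianceThreshold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real sample (assumption): 600 companies,
# 49 predictors, roughly 1 ST company (label 1) per 3 normal ones (label 0).
X, y = make_classification(
    n_samples=600, n_features=49, n_informative=10,
    weights=[0.75, 0.25], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# First-layer base learners named in the abstract.
base_learners = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),  # depth is an assumption
    ("nb", GaussianNB()),
]

# Second-layer learner: SVC with sigmoid kernel and class weights 1:3,
# the configuration the abstract reports for the single SVC model.
meta_learner = SVC(kernel="sigmoid", class_weight={0: 1, 1: 3})

model = Pipeline([
    ("scale", MinMaxScaler()),                       # normalization (scaler type assumed)
    ("variance", VarianceThreshold(threshold=0.0)),  # drop zero-variance features
    ("pca", PCA(n_components=0.95)),                 # keep 95% of variance (assumed cut-off)
    ("stack", StackingClassifier(
        estimators=base_learners, final_estimator=meta_learner, cv=3
    )),
])

# Small illustrative grid over the meta-learner's hyperparameters (assumed values).
param_grid = {
    "stack__final_estimator__C": [0.1, 1, 10],
    "stack__final_estimator__gamma": ["scale", 0.1],
}
search = GridSearchCV(model, param_grid, scoring="recall", cv=3)
search.fit(X_train, y_train)

y_pred = search.predict(X_test)
print("best params:", search.best_params_)
print("accuracy:", accuracy_score(y_test, y_pred))
print("recall:  ", recall_score(y_test, y_pred))
```

Scoring the grid search on recall reflects the abstract's emphasis on capturing ST enterprises; the figures printed on synthetic data say nothing about the results reported in the thesis.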

Keywords/Search Tags:

Special Treatment, Ensemble Learning, Support Vector Machine, Stacking

Related items

1	A Study For Ensemble Learning Based On SVM
2	Support Vector Machines Classifier Based On Margin Vectors
3	Research On Prediction Of Phosphorylation Modification Sites Based On Machine Learning
4	Prediction Research Of Protein-Protein Interaction Based On Ensemble Of Support Vector Machine And Random Forest
5	Support Vector Machine Data Classification
6	Generalization Performance Of Support Vector Machine Distributed Ensemble Learning Based On Markov Sampling
7	Predict Type Ⅲ Effector Protein And Antioxidant Protein Based On Machine Learning
8	Study Of Algorithms For Support Vector Machine
9	Support Vector Machine Based On Artificial Error
10	Research On Underwater Image Recognition Based On Deep Support Vector Machine