ESG is a composite framework comprising Environmental, Social, and corporate Governance factors. It has gradually become a focus and a new dimension of value assessment for government bodies, financial institutions, enterprises, and investors. ESG ratings help clarify how ESG affects enterprise performance and where an enterprise's ESG practice is weak, which encourages enterprises to deepen reform and strengthen their capacity for sustainable development. Ratings also help upgrade green sustainable development from one-way transmission to two-way transmission, encouraging more market participants to take an active part in building ESG and promoting the sound development of the ESG concept in our country. However, because of the structure of ESG itself, its indicators are numerous, span a wide range, and are difficult to obtain, and most of them are disclosed only in text reports. The number of companies in the market is also very large, so computing an ESG rating for every enterprise by manual effort alone is a huge challenge.

At present, data mining, machine learning, ensemble learning, and related algorithmic theories are rarely applied in the ESG field. Introducing machine learning model fusion into this field can therefore lay a theoretical foundation and broaden the path for subsequent research on ESG rating, while also providing a new application scenario for machine learning and ensemble learning algorithms. To address deficiencies in Stacking fusion, this paper proposes precision-based weighted summation in the cross-validation process of the Stacking base learners, and adds LightGBM feature selection to the training framework of the second-layer meta-learner; both changes can also help improve Stacking theory in future studies. In processing the ESG data set, this paper uses an adaptive oversampling algorithm to correct the class imbalance of the data set and random forest imputation to fill in variables with missing values.

Validation on the data set shows that the improved Stacking fusion algorithm based on precision weighting and feature selection outperforms single-model prediction, traditional Stacking prediction, and Blending prediction, which verifies both the feasibility of the improved Stacking algorithm and its applicability in the ESG field. Regarding the choice of base learners, using two models as base learners performs better in the Stacking fusion model than using three or more, and choosing individual learners that fit well on their own is better than choosing weak learners; one example is the combination of LightGBM and XGBoost. For a weak learner, the fusion model improves substantially over that learner only when it is combined with a relatively strong learner. Regarding the number of original features passed to the meta-learner, the model with 1 selected feature variable is superior to the models with 3 and 32 selected feature variables, indicating that the fitting effect declines as the number of features increases. Regarding precision weighting, assigning high weights to high-precision models fits better than assigning low weights to high-precision models.
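The precision-weighted summation described above can be illustrated with a minimal sketch. It assumes that "precision" means each fold model's validation score and that the weights are those scores normalized to sum to 1; the abstract does not state the exact weighting formula, so this is one plausible reading rather than the paper's implementation. The function names `precision_weights` and `weighted_test_meta_feature` are hypothetical.

```python
from typing import List, Sequence


def precision_weights(fold_scores: Sequence[float]) -> List[float]:
    """Normalize per-fold validation scores so that they sum to 1.

    Assumption: a higher validation score earns a proportionally higher weight.
    """
    total = sum(fold_scores)
    if total <= 0:
        raise ValueError("fold scores must sum to a positive value")
    return [score / total for score in fold_scores]


def weighted_test_meta_feature(
    fold_test_preds: Sequence[Sequence[float]],
    fold_scores: Sequence[float],
) -> List[float]:
    """Merge the K fold models' test-set predictions into one meta-feature column.

    fold_test_preds[k][i] is the prediction of fold model k for test sample i.
    Traditional Stacking averages the K predictions with equal weights 1/K;
    here each fold model is weighted by its normalized validation score.
    """
    if len(fold_test_preds) != len(fold_scores):
        raise ValueError("need exactly one score per fold")
    weights = precision_weights(fold_scores)
    n_samples = len(fold_test_preds[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, fold_test_preds))
        for i in range(n_samples)
    ]


# Illustrative values only: three fold models, two test samples.
scores = [0.5, 0.25, 0.25]
preds = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
meta_column = weighted_test_meta_feature(preds, scores)
```

With equal scores the function reduces to the plain average used by traditional Stacking, so the weighting only changes the meta-feature when fold models differ in validation quality.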