| Enterprise risk prediction plays a vital role in the survival and development of enterprises: it helps them identify potential risks early in their operation and prevent those risks from materializing. Early rule-based methods predicted risk by comparing features against critical values, and their accuracy was low. Machine learning models trained on large amounts of data later improved the accuracy of enterprise risk prediction considerably, but these methods rely heavily on feature engineering. The features in the data vary widely in type and dimensionality, so a unified feature engineering solution is difficult to apply. First, highly correlated continuous features reduce the discriminative power of the features. Second, raw features have weak expressive power, and for the many text features that appear, word meaning and position information take a symmetric form at word granularity, which makes these symmetric text features difficult to use. In addition, historical data accumulates over time, and existing time series models struggle with long time series because information vanishes over long spans. To address these two issues, this paper carries out the following work. (1) This paper proposes an enterprise risk prediction model based on the fusion of symmetric text features and structured features. First, a feature extraction model is proposed for text that is symmetric in position and meaning at word granularity. For the text features in financial statements, the two-dimensional word-embedding similarity matrix of a sentence pair is used as the model input; dynamic pooling, a convolutional network, and other operations are then applied to identify symmetric text and obtain its vector representation. For continuous and discrete features, corresponding feature engineering methods such as feature screening, feature construction, feature encoding, and feature scaling are used to strengthen the relationships between features and their representational ability. Finally, these vectors are fused, and evaluation and experimental comparison across different models demonstrate the effectiveness of the work. (2) This paper proposes an improved algorithm for long-term dependent sequences based on a wide-depth self-attention mechanism. The wide-depth self-attention module enhances the model’s ability to represent cross features, generalizes to sparse discrete features, and extracts text features that form a "time series within a time series". To address the vanishing of information over long time series, the Informer model for long time series is used to complete the long-sequence prediction task and the risk prediction task, respectively. Combining the above features and representation methods makes it possible to determine enterprise risks more accurately and to predict future enterprise risk targets. |
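
The sentence-pair input described in contribution (1) can be illustrated with a minimal sketch. The abstract does not say which similarity measure or pooling rule is used, so cosine similarity and max pooling over a fixed-size grid are assumptions here, and all function names (`cosine`, `similarity_matrix`, `dynamic_max_pool`) are hypothetical. The sketch only shows how sentences of different lengths can yield a fixed-size matrix that a convolutional network could consume.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def similarity_matrix(sent_a: List[Vector], sent_b: List[Vector]) -> Matrix:
    """Entry [i][j] is the similarity of word i in sent_a and word j in sent_b."""
    return [[cosine(u, v) for v in sent_b] for u in sent_a]


def _bounds(length: int, bins: int) -> List[range]:
    """Split range(length) into `bins` non-empty chunks (they may overlap)."""
    chunks = []
    for k in range(bins):
        start = (k * length) // bins
        # Ceiling division for the end, and at least one element per chunk.
        stop = max(((k + 1) * length + bins - 1) // bins, start + 1)
        chunks.append(range(start, min(stop, length)))
    return chunks


def dynamic_max_pool(matrix: Matrix, out_rows: int, out_cols: int) -> Matrix:
    """Max-pool a variable-size matrix down (or up) to out_rows x out_cols."""
    rows = _bounds(len(matrix), out_rows)
    cols = _bounds(len(matrix[0]), out_cols)
    return [[max(matrix[i][j] for i in r for j in c) for c in cols] for r in rows]


# Toy word embeddings (assumed values): a 3-word and a 2-word sentence.
sent_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sent_b = [[1.0, 0.0], [0.0, 1.0]]

sim = similarity_matrix(sent_a, sent_b)   # 3 x 2 similarity matrix
pooled = dynamic_max_pool(sim, 2, 2)      # fixed 2 x 2 input for a CNN
print(pooled)
```

Because the pooled grid has the same shape regardless of sentence length, sentence pairs of any size can be batched together before the convolutional layers.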