Financial Enterprise Portrait Label Prediction Based On Incomplete Multi-View

Posted on: 2024-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: J Cao
GTID: 2568307127453604
Subject: Software engineering
Abstract/Summary:
In the financial field, handling missing data is a prerequisite for data mining, because missing values limit how the data can be studied and processed. Existing research on missing data imputation focuses mainly on single-view datasets, and imputation methods for multi-view missing data still have shortcomings. This paper proposes a weighted multi-view random forest imputation algorithm for missing data in financial enterprises and further studies the application of missing data imputation. The main research content and innovations of this paper are as follows:

1. This paper investigates the problems associated with multi-view data and proposes a novel approach to missing data in multi-view datasets based on a random forest imputation model. The proposed method combines the ideas of multi-view learning and ensemble learning: it constructs a random forest imputation model that uses feature importance as the weight for each view, in order to fill in the missing labels in multi-view data effectively. The research method and complexity of the multi-view missing data imputation task are analyzed, with multi-view ensembling as the research focus. By introducing ensemble learning into the existing random forest imputation model, a weighted multi-view ensemble strategy produces an imputation result that integrates information from all views. Experimental results show that the proposed method is well suited to multi-view data and performs well on both real multi-view datasets from financial enterprises and public multi-view datasets. Compared with traditional random forest imputation algorithms, the proposed method reduces the average error in missing data imputation by 1.6% and fills in missing data in multi-view datasets more effectively.

2. This paper develops a multi-view missing data imputation system based on the proposed multi-view imputation model. The system integrates several functions for multi-view missing data, namely missing data simulation, model imputation, data analysis, and imputation recommendation, into a complete missing data imputation workflow. The missing data simulation function preprocesses the uploaded data files and generates datasets with various missing data characteristics, which serve as comparison targets for the imputation models. The model imputation function integrates both single-view and multi-view imputation models, imputes the missing data in the different views, and obtains the imputation errors by comparing the results with the original data. The data analysis function generates visualizations that help users understand the imputation errors of different models under different missing data situations. The imputation recommendation function analyzes these results and selects the optimal imputation model for each missing feature in the dataset, giving users the best imputation solution. In addition, functional and performance tests are conducted on the main modules of the system to confirm its effectiveness.

In summary, this paper proposes an effective method for imputing missing data in multi-view datasets that improves imputation performance on multi-view data. Further research indicates that using feature importance as view weights in the random forest imputation model effectively reduces the imputation error on multi-view data, which offers useful guidance for multi-view missing data imputation tasks. In addition, the multi-view imputation system designed in this paper integrates imputation models for different views, supports imputation testing, and recommends the optimal imputation method based on the test results, so that the best imputed data can be obtained. The system therefore has practical application value.
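
The weighted multi-view ensemble step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so this example assumes a normalized weighted average: each view supplies its own imputed values (here taken as given, standing in for the output of a per-view random forest imputer) together with an importance score, and the scores are normalized into view weights. All names, view labels, and numbers below are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def normalize_weights(importances: Dict[str, float]) -> Dict[str, float]:
    """Scale per-view importance scores so that they sum to 1."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {view: score / total for view, score in importances.items()}


def weighted_ensemble(per_view_estimates: Dict[str, List[float]],
                      importances: Dict[str, float]) -> List[float]:
    """Combine per-view imputed values into one value per missing entry.

    Assumption: the combination is a normalized weighted average; the
    abstract only states that feature importance is used as the view weight.
    """
    weights = normalize_weights(importances)
    n_missing = len(next(iter(per_view_estimates.values())))
    return [
        sum(weights[view] * values[i]
            for view, values in per_view_estimates.items())
        for i in range(n_missing)
    ]


def mean_absolute_error(truth: List[float], imputed: List[float]) -> float:
    """Average absolute gap between the held-out true values and the imputed ones."""
    return sum(abs(t - p) for t, p in zip(truth, imputed)) / len(truth)


# Hypothetical inputs: two views, two missing entries.
estimates = {"view_a": [10.0, 20.0], "view_b": [14.0, 16.0]}
importances = {"view_a": 3.0, "view_b": 1.0}  # view_a is trusted more

filled = weighted_ensemble(estimates, importances)
error = mean_absolute_error([11.0, 18.0], filled)
print(filled, error)
```

Comparing `error` against the error of each single-view estimate mirrors, in miniature, how the system described above obtains imputation errors by comparing imputed values with the original data.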
Keywords/Search Tags:Missing data imputation, Random forest, Ensemble learning, Multi-view learning, System development