| With the rapid development of information technology, more and more people are paying attention to data and seeking valuable information through data analysis. Chart visualization has become an effective tool for data analysis because it reveals the information in data intuitively and efficiently. However, creating a visualization with a professional tool requires a series of relatively complex actions, such as selecting chart types, axes, and data attributes. This is a process of continual exploration that reduces analysis efficiency. In addition, current visualization tools require some programming ability, so the threshold for using them is high. To reduce the difficulty of data visualization and improve the efficiency of visualization creation, a data-driven visualization recommendation algorithm is proposed. This thesis mainly investigates the recommendation of visualization charts for multi-dimensional data and applies the proposed algorithm to high-speed railway malfunction data. To address the rule redundancy and poor extensibility of rule-based visualization chart recommendation systems, a data-driven chart recommendation algorithm is proposed, which predicts visualization types by extracting data features from the corresponding raw data sets, constructing feature data sets, and training a recommendation model. A data balancing algorithm based on class decomposition and a generative adversarial network is proposed to solve the problem of data imbalance in chart visualization recommendation. Although data sampling techniques can alleviate data imbalance, they still have shortcomings in the quality of the generated samples and in retaining the original data information. To ensure the effectiveness of the recommendation results despite the imbalance of sample categories in the feature data sets, this thesis decomposes the majority-class samples and synthesizes minority-class samples. The K-means clustering algorithm is used to decompose the majority-class samples into disjoint clusters, which reduces their dominant effect in classification and avoids the loss of information caused by conventional under-sampling. A generative adversarial network is adopted to synthesize samples, so that the generated samples conform to the data distribution and retain a certain diversity. Considering the high dimensionality of the feature data set, a random forest is employed to predict the chart type and recommend appropriate charts to users, which reduces the difficulty and time cost of data visualization and improves the efficiency and accuracy of data analysis. To further verify the effectiveness of the proposed data-driven recommendation algorithm for multi-dimensional data visualization charts, it is applied to the analysis and visualization of high-speed railway malfunction data. Based on the malfunction data and the business requirements, a data model is established and appropriate charts are recommended to display the malfunction information clearly, helping high-speed railway staff identify malfunctions quickly and accurately and thereby improving maintenance efficiency. The data visualization project assists users in making accurate decisions and provides an effective intelligent tool for the safe operation of high-speed railways. |
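
The class decomposition step described in the abstract can be illustrated with a minimal sketch. It assumes the feature data set is a list of numeric tuples with one chart-type label per row. The function names `kmeans` and `decompose_majority`, the sub-class label format, the toy data, and the choice of two clusters are all hypothetical and are not taken from the thesis. The sketch covers only the K-means relabeling of the majority class; the generative adversarial network and the random forest are not shown.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means. Returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    assign = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_assign = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        if new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def decompose_majority(X, y, majority_label, k, seed=0):
    """Relabel majority-class rows as k disjoint sub-classes.

    No row is discarded, unlike under-sampling: each majority row
    only receives a finer-grained label such as "bar_0" or "bar_1".
    """
    idx = [i for i, label in enumerate(y) if label == majority_label]
    clusters = kmeans([X[i] for i in idx], k, seed=seed)
    new_y = list(y)
    for i, c in zip(idx, clusters):
        new_y[i] = f"{majority_label}_{c}"
    return new_y


# Toy feature rows (hypothetical): six "bar" rows in two groups, two "line" rows.
X = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
    (9.0, 0.0), (9.1, 0.1),
]
y = ["bar"] * 6 + ["line"] * 2

new_y = decompose_majority(X, y, "bar", k=2)
print(new_y)
```

After decomposition, each "bar" sub-class has three rows instead of six, so the majority class dominates the two "line" rows less. A classifier trained on the sub-class labels would map its predictions back to the original chart type by stripping the cluster suffix.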