As environmental problems intensify worldwide, electric vehicles (EVs) have become a priority. State of Health (SOH) is an essential parameter for characterizing battery aging, and its accurate estimation in practical applications is vital for both companies and users. Traditional methods based on laboratory conditions do not fully reflect the state of the battery in real-life scenarios. At the same time, the EV Big Data Monitoring Cloud Platform, a product of the combination of the automotive industry and big data, makes it possible to study SOH in actual vehicles by analyzing parameters collected from them. This paper explores real-world vehicle data and discusses the pre-processing of the original data, the mining of battery aging health factors, and the establishment of the estimation model, all of which have value in practical engineering applications. The main work is as follows:

(1) SOH research methods are classified, the shortcomings of SOH estimation research based on laboratory conditions are summarized, the current status of SOH research based on real-world vehicle data is analyzed, and the research framework of this paper is determined.

(2) The reasons for the poor quality of real-world vehicle data are summarized, and solutions are provided. The original data are first de-duplicated and reordered, and the missing data are then analyzed using a missing matrix and a tree diagram. Four filling algorithms are compared, and the best-performing one is selected to fill the data. Finally, data outliers are categorized and identified using methods such as box plots and anomaly detection based on Gaussian probability distributions. Individual data fields are checked against the actual vehicle situation to improve the quality of the raw data. A preliminary capacity calculation is carried out on selected deep charging segments. The reference capacity label is obtained by smoothing the preliminary capacity curve with a generalized additive model, preparing the data for subsequent processing.

(3) To address the shallow level of health factor extraction, this paper mines health factors related to the causes of battery aging in real-world vehicles. First, the cumulative driving mileage and the equivalent cycle number are selected as health factors from the time perspective. Then the charging time over small voltage intervals in different segments is extracted as a new health factor. The frequency of user behaviors is used as a health factor to characterize user habits. A health factor describing cell inconsistency is extracted by analyzing the relationship between the extreme cell voltages and the total voltage. The health factors are evaluated using the Spearman correlation coefficient. The parameters of factors with low correlation scores are optimized with a Bayesian optimization algorithm to obtain factors of higher quality, preparing the inputs for the subsequent model.

(4) An extreme learning machine (ELM) model is established to estimate the capacity, and its performance is compared before and after factor optimization. The model is then improved to address the poor accuracy caused by the random initialization of the ELM parameters: a particle swarm optimization algorithm is used to optimize the weights and biases of the ELM, and the relevant initial parameters are selected. The improved ELM estimation model is verified from three perspectives: different feature inputs, comparison with different algorithms, and application to different vehicles.

The method presented in this paper provides a reasonable estimate of battery SOH for real-world vehicles and offers an estimation framework for the study of battery SOH based on real-world vehicle data.
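
As a rough illustration of the ELM step in item (4), the following minimal sketch fits a single-hidden-layer ELM to synthetic data and repeats the fit with several random seeds. Everything in it is an assumption made for illustration and is not taken from this paper: the function names `elm_fit` and `elm_predict`, the two synthetic health factors, the hidden-layer size, and the small ridge term `lam` added to keep the least-squares solve numerically stable. The particle swarm optimization of the weights and biases is not implemented here.

```python
import numpy as np

def elm_fit(X, y, n_hidden=20, lam=1e-6, seed=0):
    """Fit an ELM: random, untrained hidden layer plus least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                   # sigmoid hidden-layer output
    # Ridge-regularized normal equations (assumption: lam is added for stability).
    beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def elm_predict(X, model):
    """Apply a fitted ELM to new inputs."""
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic stand-ins for two health factors (hypothetical, normalized to about [0, 1]).
data_rng = np.random.default_rng(42)
n = 200
mileage = data_rng.uniform(0.0, 1.0, n)                 # normalized cumulative mileage
cycles = 0.8 * mileage + data_rng.normal(0.0, 0.05, n)  # normalized equivalent cycles
X = np.column_stack([mileage, cycles])
# Made-up normalized capacity that fades with usage, plus measurement noise.
y = 1.0 - 0.2 * mileage - 0.05 * cycles ** 2 + data_rng.normal(0.0, 0.005, n)

X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# Refit with different seeds: only the random hidden layer changes between runs.
errors = [
    rmse(y_test, elm_predict(X_test, elm_fit(X_train, y_train, seed=s)))
    for s in range(5)
]
for s, e in enumerate(errors):
    print(f"seed={s}  test RMSE={e:.5f}")
```

Because the hidden weights `W` and biases `b` are drawn at random and never trained, the test error generally differs from seed to seed. This is the sensitivity that motivates searching over `W` and `b` with an optimizer such as particle swarm optimization, using the validation error as the fitness value.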