| With the continuous development of emerging technologies in industry, industrial machinery and equipment are becoming increasingly complex and precise. Rolling bearings are among the main components of such equipment, and their fault frequency increases with years of service. Because rolling bearings come in many types and have complex structures, it is difficult to match an observed fault phenomenon to its cause during actual fault diagnosis. To address this problem, fault diagnosis techniques based on machine learning have been widely researched and applied, and the decision tree algorithm is one of the successful techniques in this field. Despite its successful application, the decision tree algorithm still suffers from problems such as inaccurate diagnosis results, high time cost, and low diagnosis efficiency. This thesis analyzes the problem of multicollinearity among attributes and the limitations of the traditional decision tree algorithm, and proposes a multivariate logistic regression analysis algorithm and a C4.5 decision tree algorithm based on the Taylor series. When the decision tree model is built, multicollinearity among multiple fault attributes leads to inaccurate fault classification results. To address this, the thesis first applies the Kolmogorov-Smirnov (K-S) nonparametric normality test and the Spearman rank correlation test. The former compares the theoretical cumulative frequency distribution with the observed empirical frequency distribution and uses the deviation between them to determine whether the data follow a given distribution. The latter introduces a rank matrix for correlation analysis of the fault attributes and incorporates the variance inflation factor (VIF) to diagnose multicollinearity among the independent fault variables. Finally, a multivariate logistic regression analysis algorithm is proposed, and the significant attributes that remain after screening are used to construct the rolling bearing decision tree fault diagnosis model. To address the high time cost and low diagnostic efficiency of the traditional C4.5 decision tree algorithm, this thesis introduces the Taylor series into the computation of the decision tree model functions. The formulas for information entropy, information gain, and information gain ratio in the traditional C4.5 algorithm are improved by replacing the logarithmic function with its Taylor series polynomial. The logarithmic operations in the decision tree model are thereby transformed into the four basic arithmetic operations on a finite number of terms, and the improved decision tree model is then constructed according to the magnitude of the information gain ratio. This thesis uses the bearing dataset released by the National Standardization Administration to design multiple sets of comparison experiments that verify the effectiveness of the proposed methods. The attribute-screening comparison experiments show that the multivariate logistic regression analysis algorithm achieves better fault diagnosis performance, with shorter running time, improved model accuracy, and reduced multicollinearity among attributes. The comparison experiments on the improved decision tree algorithm show that introducing the Taylor series into the C4.5 algorithm improves the classification efficiency of the decision tree model and reduces the time complexity of model diagnosis. |
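
The idea of replacing the logarithm in the C4.5 formulas with a finite polynomial can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not give the expansion point or the number of terms, so the sketch assumes the series of ln(p) about p = 1, truncated to 8 terms. The function names (`ln_taylor`, `entropy`, `gain_ratio`) and the toy attribute values are hypothetical and invented for the example. Natural logarithms are used because the constant factor 1/ln 2 cancels in the gain ratio.

```python
import math
from collections import Counter


def ln_taylor(p, terms=8):
    """Truncated Taylor series of ln(p) about p = 1 (assumed expansion):
    ln(p) ~= -sum_{k=1..terms} (1 - p)^k / k, valid for 0 < p <= 1.
    Uses only addition, subtraction, multiplication and division."""
    u = 1.0 - p
    total = 0.0
    power = 1.0
    for k in range(1, terms + 1):
        power *= u          # (1 - p)^k built up by repeated multiplication
        total -= power / k
    return total


def entropy(items, ln=ln_taylor):
    """Entropy (in nats) of a list of discrete values, using the given log."""
    n = len(items)
    return -sum((c / n) * ln(c / n) for c in Counter(items).values())


def gain_ratio(values, labels, ln=ln_taylor):
    """C4.5 gain ratio of one discrete attribute: gain / split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g, ln) for g in groups.values())
    gain = entropy(labels, ln) - conditional
    split_info = entropy(values, ln)
    return gain / split_info if split_info > 0 else 0.0


# Toy data, invented purely for illustration (not from the thesis dataset).
labels = ["fault", "fault", "normal", "normal", "fault", "normal"]
vibration = ["high", "high", "low", "low", "high", "low"]
temperature = ["hot", "cold", "hot", "cold", "hot", "cold"]

for name, attr in [("vibration", vibration), ("temperature", temperature)]:
    approx = gain_ratio(attr, labels)               # Taylor-series logarithm
    exact = gain_ratio(attr, labels, ln=math.log)   # library logarithm
    print(f"{name}: approx={approx:.4f} exact={exact:.4f}")
```

The truncated series converges slowly for probabilities close to 0, so the number of terms trades accuracy against arithmetic cost. What matters for tree construction is that the attribute ranking by gain ratio is preserved, which this toy case satisfies.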