Feature Selection Algorithm And Its Research And Application In Causal Discovery

Posted on: 2022-06-23  Degree: Master  Type: Thesis
Country: China  Candidate: A B Shen  Full Text: PDF
GTID: 2518306560955019  Subject: Software engineering
Abstract/Summary:
Feature selection plays an important role in data analysis and dimensionality reduction. Many current feature selection methods share common problems: they require extensive parameter tuning, run slowly, and select features with poor predictive power. Developing a high-performance feature selection method that is suitable for practical use and reduces human intervention remains a challenge. Feature selection is also closely connected to Bayesian network structure learning in the field of causal discovery. Most current causal learning algorithms suffer from high time complexity and poor accuracy, and they cannot be applied to streaming features, continuous data, or nonlinear data with weakly additive noise. This thesis studies feature selection methods and incremental learning methods for Bayesian networks, analyzes the problems above, and proposes solutions. The specific work is as follows.

First, we combine the Maximum-Relevance Minimum-Redundancy criterion with a specific classifier and propose a wrapper feature selection method called FEFS. The method computes a score for each feature in turn and then uses the classifier to evaluate whether the feature effectively improves model accuracy, which decides whether the feature is selected. The algorithm synchronizes feature selection with prediction, reduces the tuning of algorithm parameters, and lets the model achieve its best performance with the fewest selected features.

Second, we design a computer-aided diagnosis (CAD) system based on the FEFS algorithm to predict the semantic features of pulmonary nodules. The system takes the digital features of CT images of pulmonary nodules, eliminates a large number of uninformative image features through fast feature scoring and classifier search, and, after model training, outputs semantic feature grades of pulmonary nodules that provide important guidance for physicians. Experimental results show that the proposed method performs well.

Third, based on the log-likelihood function, we redefine relevant features and redundant features and propose a comparison method that identifies the candidate neighbor nodes of a target node from the perspective of the causal graph structure. Building on this identification method, we propose NLCDSF, an algorithm that learns causal structure from nonlinear data with weakly additive noise. As an incremental learning method for Bayesian network structure, NLCDSF analyzes the relationships between features online, quickly identifies the candidate neighbors of the target node with the comparison method, and reduces the search scope of the subsequent orientation process. Experiments on simulated data sets show that the proposed algorithm has clear advantages in both accuracy and running time.

Finally, to demonstrate NLCDSF on real data, we apply the algorithm to equipment monitoring data from a power plant and develop a complete equipment detection system that combines NLCDSF with long short-term memory (LSTM). Equipment monitoring data is fed into the system online; NLCDSF learns the causal network among the monitoring points, and the Markov equivalence class of the target measuring point is used as the LSTM input to predict the future fluctuation trend of that point, supporting technical personnel in analyzing and troubleshooting equipment faults. Experimental results show that the proposed algorithm is effective in a real production environment.
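The abstract's description of FEFS (score each feature with a relevance/redundancy criterion, then let a classifier decide whether to keep it) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the toy data, the use of absolute Pearson correlation in place of the mutual-information terms usually used for Maximum-Relevance Minimum-Redundancy, and the nearest-centroid classifier are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def abs_corr(a, b):
    """Absolute Pearson correlation; 0.0 when either input is constant."""
    sa, sb = pstdev(a), pstdev(b)
    if sa == 0 or sb == 0:
        return 0.0
    ma, mb = mean(a), mean(b)
    cov = mean((x - ma) * (y - mb) for x, y in zip(a, b))
    return abs(cov / (sa * sb))


def mrmr_order(X, y):
    """Greedy ranking: relevance to y minus mean redundancy with picked features."""
    cols = [[row[j] for row in X] for j in range(len(X[0]))]
    relevance = [abs_corr(c, y) for c in cols]
    order, remaining = [], set(range(len(cols)))
    while remaining:
        def score(j):
            if not order:
                return relevance[j]
            return relevance[j] - mean(abs_corr(cols[j], cols[k]) for k in order)
        best = max(sorted(remaining), key=score)
        order.append(best)
        remaining.remove(best)
    return order


def centroid_accuracy(X_train, y_train, X_test, y_test, features):
    """Held-out accuracy of a nearest-centroid classifier restricted to `features`."""
    centroids = {}
    for label in set(y_train):
        rows = [r for r, t in zip(X_train, y_train) if t == label]
        centroids[label] = [mean(r[j] for r in rows) for j in features]
    correct = 0
    for row, target in zip(X_test, y_test):
        def dist(label):
            return sum((row[j] - c) ** 2 for j, c in zip(features, centroids[label]))
        correct += min(sorted(centroids), key=dist) == target
    return correct / len(y_test)


def select_features(X_train, y_train, X_test, y_test):
    """Wrapper step: keep a ranked feature only if it strictly improves accuracy."""
    selected, best_acc = [], 0.0
    for j in mrmr_order(X_train, y_train):
        acc = centroid_accuracy(X_train, y_train, X_test, y_test, selected + [j])
        if acc > best_acc:
            selected, best_acc = selected + [j], acc
    return selected, best_acc


def make_toy_data(seed=0, n=300, n_train=200):
    """Hypothetical data: column 0 is noise, 1 is informative, 2 is a noisy copy of 1."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = label + rng.uniform(-0.3, 0.3)
        redundant = informative + rng.uniform(-0.5, 0.5)
        X.append([rng.uniform(-1.0, 1.0), informative, redundant])
        y.append(label)
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]


X_tr, y_tr, X_te, y_te = make_toy_data()
selected, best_acc = select_features(X_tr, y_tr, X_te, y_te)
print("selected feature indices:", selected, "held-out accuracy:", best_acc)
```

Because the classifier check runs inside the selection loop, no separate threshold on the score or on the number of features has to be tuned, which is the property the abstract attributes to synchronizing selection with prediction. On this toy data the redundant and noise columns add no accuracy, so the loop is expected to keep only the informative column.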
Keywords/Search Tags: Feature Selection, Causal Structure Learning, Streaming Feature, Weakly Additive Noise, Non-linear Data