
Research On Automated Feature Engineering Algorithm And System For Structured Data

Posted on: 2022-05-06 | Degree: Master | Type: Thesis
Country: China | Candidate: X Guo | Full Text: PDF
GTID: 2518306725993009 | Subject: Computer Science and Technology
Abstract/Summary:
With the development of information technology, more and more industries are actively transforming and upgrading toward informatization and intelligence. Compared with the currently popular deep learning, traditional machine learning has more advantages in processing structured data, but model performance depends heavily on feature quality, and manual feature extraction is time-consuming and laborious. Feature extraction often requires data analysts to experiment repeatedly through trial and error, and the process relies on expert experience. To shorten the machine learning modeling cycle and improve the efficiency and quality of feature engineering, automated feature engineering has emerged: high-value features are constructed automatically from raw datasets by machines instead of humans, which improves the performance of machine learning models and reduces the reliance on complex models. However, existing automated feature engineering methods often ignore the actual meaning of the data when constructing new features, and are poorly interpretable and ineffective. This thesis first proposes EAAFE, an automated feature engineering method based on a constrained-optimization evolutionary algorithm; second, it proposes DMAFE, a two-stage automated feature engineering method based on reinforcement learning and meta-learning; finally, it implements an easy-to-use automated feature engineering system. The main research work and contributions of this thesis are as follows.
(1) To address the problem of too many candidate features in existing expansion-selection methods and feature ranking methods, the automated feature engineering problem is converted into a feature transformation function sequence search problem, in which each original feature corresponds to a sequence of feature transformation functions of a given length. The sequence supports transformations of both continuous and discrete features, as well as nesting of feature transformation functions.
(2) The thesis proposes EAAFE, an automated feature engineering method based on a constrained-optimization evolutionary algorithm. EAAFE first encodes the feature transformation functions, then constrains the candidate feature transformation functions according to feature type to limit the search space, and finally uses the evolutionary algorithm to search iteratively for the optimal feature transformation function sequence. Experimental results show that EAAFE outperforms existing automated feature engineering algorithms on most datasets.
(3) The thesis proposes DMAFE, a two-stage automated feature engineering method based on reinforcement learning and meta-learning. To address the slow convergence and low efficiency of existing reinforcement-learning-based automated feature engineering methods, it models automated feature engineering as a sequential decision problem and solves it with a reinforcement learning method that combines a policy-value neural network with Monte Carlo tree search. The policy-value deep neural network guides the Monte Carlo tree search, which automatically constructs a sequence of feature transformation functions. To further accelerate the search, the parameters of the policy-value neural network are initialized using a meta-learning approach.
(4) Building on EAAFE and DMAFE, the thesis designs and implements an automated feature engineering system that supports both methods. High-level programming interfaces and modular integration methods improve the ease of use and scalability of the system.
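As a rough illustration of the sequence-search formulation in (1) and the type-constrained evolutionary search in (2), the sketch below assigns each original feature a fixed-length sequence of transformation functions and evolves those sequences. It is not the thesis's implementation: the operator sets, the function names (`evolve`, `fitness`, `OPS_BY_TYPE`), the truncation selection scheme, and the correlation-based fitness (a cheap stand-in for evaluating a downstream model) are all assumptions made for this example.

```python
import math
import random

# Hypothetical operator sets; the abstract does not list the actual operators.
CONTINUOUS_OPS = {
    "identity": lambda x: x,
    "square": lambda x: x * x,
    "sqrt_abs": lambda x: math.sqrt(abs(x)),
    "log1p_abs": lambda x: math.log1p(abs(x)),
}
DISCRETE_OPS = {
    "identity": lambda x: x,
    "is_zero": lambda x: 1.0 if x == 0 else 0.0,
}
# Constraint: a feature may only draw operators defined for its own type.
OPS_BY_TYPE = {"continuous": CONTINUOUS_OPS, "discrete": DISCRETE_OPS}


def apply_sequence(column, op_names, ops):
    """Apply the operators in order, so later operators nest earlier ones."""
    for name in op_names:
        column = [ops[name](v) for v in column]
    return column


def abs_corr(xs, ys):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(sxy) / math.sqrt(sxx * syy)


def fitness(individual, data, types, target):
    """Toy fitness (assumption): mean |correlation| of transformed features."""
    scores = [
        abs_corr(apply_sequence(data[f], seq, OPS_BY_TYPE[types[f]]), target)
        for f, seq in individual.items()
    ]
    return sum(scores) / len(scores)


def random_individual(types, seq_len, rng):
    return {f: [rng.choice(list(OPS_BY_TYPE[t])) for _ in range(seq_len)]
            for f, t in types.items()}


def crossover(a, b, rng):
    """Take each feature's whole sequence from one of the two parents."""
    return {f: list(a[f] if rng.random() < 0.5 else b[f]) for f in a}


def mutate(individual, types, rng):
    """Replace one operator with another that is valid for the feature type."""
    child = {f: list(seq) for f, seq in individual.items()}
    f = rng.choice(list(child))
    pos = rng.randrange(len(child[f]))
    child[f][pos] = rng.choice(list(OPS_BY_TYPE[types[f]]))
    return child


def evolve(data, types, target, seq_len=2, pop_size=20, generations=15, seed=0):
    rng = random.Random(seed)
    population = [random_individual(types, seq_len, rng) for _ in range(pop_size)]

    def score(ind):
        return fitness(ind, data, types, target)

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # keep the better half unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), types, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=score)


# Toy data: the target depends on x only through x squared.
data = {
    "x": [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0],
    "flag": [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0],
}
types = {"x": "continuous", "flag": "discrete"}
target = [v * v for v in data["x"]]

best = evolve(data, types, target)
print(best, fitness(best, data, types, target))
```

Because mutation and initialization draw only from the operator set for each feature's type, every candidate stays inside the constrained search space. In a realistic setting the fitness would presumably be a cross-validated model score rather than a correlation.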
Keywords/Search Tags: feature engineering, automated feature engineering, feature transformation function sequence search