Research On Automated Feature Engineering Algorithm And System For Structured Data

Posted on:2022-05-06

Degree:Master

Type:Thesis

Country:China

Candidate:X Guo

Full Text:PDF

GTID:2518306725993009

Subject:Computer Science and Technology

Abstract/Summary:

With the development of information technology, more and more industries are actively moving toward digitalization and intelligent systems. Compared with the currently popular deep learning, traditional machine learning retains advantages in processing structured data, but model performance depends heavily on feature quality, and manual feature extraction is time-consuming and laborious. Feature extraction often requires data analysts to experiment repeatedly through trial and error, and the process relies on expert experience. Automated feature engineering has emerged to shorten the machine learning modeling cycle and to improve the efficiency and quality of feature engineering: high-value features are constructed automatically from raw datasets by machines instead of humans, which improves the performance of machine learning models and reduces the reliance on complex models. However, existing automated feature engineering methods often ignore the real meaning of the data when constructing new features, and they offer poor interpretability and limited effectiveness.

This thesis first proposes EAAFE, an automated feature engineering method based on a constrained-optimization evolutionary algorithm; second, it proposes DMAFE, a two-stage automated feature engineering method based on reinforcement learning and meta-learning; finally, it implements an easy-to-use automated feature engineering system. The main research work and contributions are as follows.

(1) To address the problem of too many candidate features in existing expansion-selection methods and feature ranking methods, the automated feature engineering problem is converted into a search over sequences of feature transformation functions, where each original feature corresponds to a sequence of feature transformation functions of a fixed length. The sequence supports transformations of both continuous and discrete features, as well as nesting of feature transformation functions.

(2) The thesis proposes EAAFE, an automated feature engineering method based on a constrained-optimization evolutionary algorithm. It first encodes the feature transformation functions, then constrains the candidate feature transformation functions according to the feature types to limit the search space, and finally uses the evolutionary algorithm to search iteratively for the optimal sequence of feature transformation functions. The experimental results show that EAAFE outperforms existing automated feature engineering algorithms on most datasets.

(3) The thesis proposes DMAFE, a two-stage automated feature engineering method based on reinforcement learning and meta-learning. To address the slow convergence and low efficiency of existing reinforcement-learning-based automated feature engineering methods, it models the automated feature engineering problem as a sequential decision problem and solves it with a reinforcement learning method that combines a policy-value neural network with Monte Carlo tree search. The policy-value deep neural network guides the Monte Carlo tree search to construct a sequence of feature transformation functions automatically. To further accelerate the search, the parameters of the policy-value neural network are initialized using a meta-learning approach.

(4) Building on EAAFE and DMAFE, the thesis designs and implements an automated feature engineering system that supports both methods. High-level programming interfaces and modular integration mechanisms improve the ease of use and scalability of the system.
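The idea behind contributions (1) and (2) can be illustrated with a minimal Python sketch: each original feature is assigned a fixed-length sequence of transformation functions, the candidate functions are restricted by feature type, and an evolutionary loop searches over these sequences. This is not the thesis's implementation. The operator catalogue `OPS`, the sequence length `SEQ_LEN`, the toy data, and every function name are assumptions made for illustration, and the fitness function is a simple stand-in (mean absolute Pearson correlation with the target), where a real system would more plausibly score candidates by downstream model performance.

```python
import math
import random

SEQ_LEN = 3  # assumed fixed sequence length per original feature

# Hypothetical operator catalogue, keyed by feature type (the type constraint).
OPS = {
    "continuous": {
        "identity": lambda x: x,
        "log1p_abs": lambda x: math.log1p(abs(x)),
        "sqrt_abs": lambda x: math.sqrt(abs(x)),
        "square": lambda x: x * x,
    },
    "discrete": {
        "identity": lambda x: x,
        "is_zero": lambda x: 1 if x == 0 else 0,
        "parity": lambda x: x % 2,
    },
}

def apply_sequence(values, ftype, seq):
    """Apply the transformations in order, i.e. as nested function calls."""
    out = list(values)
    for name in seq:
        out = [OPS[ftype][name](v) for v in out]
    return out

def abs_pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else abs(sxy) / math.sqrt(sxx * syy)

def fitness(ind, columns, types, target):
    """Stand-in fitness: mean |correlation| of transformed features with target."""
    scores = [abs_pearson(apply_sequence(columns[f], types[f], seq), target)
              for f, seq in ind.items()]
    return sum(scores) / len(scores)

def random_individual(types, rng):
    return {f: [rng.choice(sorted(OPS[t])) for _ in range(SEQ_LEN)]
            for f, t in types.items()}

def crossover(a, b, rng):
    """Single-point crossover, applied separately to each feature's sequence."""
    child = {}
    for f in a:
        cut = rng.randrange(1, SEQ_LEN)
        child[f] = a[f][:cut] + b[f][cut:]
    return child

def mutate(ind, types, rng, rate=0.2):
    """Resample positions, drawing only from operators valid for the feature type."""
    child = {f: list(seq) for f, seq in ind.items()}
    for f, seq in child.items():
        for i in range(SEQ_LEN):
            if rng.random() < rate:
                seq[i] = rng.choice(sorted(OPS[types[f]]))
    return child

def evolve(columns, types, target, generations=20, pop_size=20, seed=0):
    rng = random.Random(seed)
    def score(ind):
        return fitness(ind, columns, types, target)
    identity = {f: ["identity"] * SEQ_LEN for f in types}
    population = [identity] + [random_individual(types, rng)
                               for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection with elitism
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                           types, rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=score)

# Toy data (made up for this sketch): one continuous and one discrete feature.
columns = {"amount": [float(i) for i in range(1, 41)],
           "visits": [i % 5 for i in range(40)]}
types = {"amount": "continuous", "visits": "discrete"}
target = [math.log1p(a) + 0.5 * (v % 2)
          for a, v in zip(columns["amount"], columns["visits"])]

best = evolve(columns, types, target)
print(best, round(fitness(best, columns, types, target), 3))
```

Because the top half of each generation is carried over unchanged and the all-identity individual is seeded into the initial population, the returned sequences never score worse than leaving the features untransformed under this stand-in fitness.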

Keywords/Search Tags:

feature engineering, automated feature engineering, feature transformation function sequence search

Related items

1	Research On Auxiliary System Of Engineering Drawing Based On Feature Recognition
2	Research On Automatic Feature Engineering Algorithms For Classification Problems Of Categorical Features
3	Research On Automated Feature Engineering Algorithms For Classification Problems Of Numerical Features
4	Study On Feature Technology In Reverse Engineering CAD Modeling
5	Study On Segmentation And Constraint-based Feature Reconstruction In Reverse Engineering
6	Study On Key Modeling Techniques Based On Section Feature In Reverse Engineering
7	Design And Implementation Of Intelligent Feature Engineering Platform For Telecommunication Data
8	Application Of Automatic Feature Engineering Based Representation Learning For Categorical Features
9	Research On Feature Line And Surface Extraction In Reverse Engineering
10	Feature Model Instance Mining For Software Product Line Engineering