Research On Automated Feature Engineering Algorithms For Classification Problems Of Numerical Features

Posted on:2021-11-22

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Cai

Full Text:PDF

GTID:2518306476952959

Subject:Computer Science and Technology

Abstract/Summary:

In recent years, automated machine learning has become a new sub-field of machine learning, and every step of the machine learning pipeline can be developed in the direction of automation. Among these steps, feature engineering is one of the main difficulties in applying AI in industry, and the quality of the features is the foundation of the subsequent learning models. Since raw features rarely lead to satisfactory results, manual feature generation is often needed to better represent the data and improve learning performance. However, this work is usually tedious and task-specific, which has motivated research on automated feature generation. Most early work on automated feature generation produced features through combinations of strictly pre-defined operations, which limited scalability; later, deep learning methods appeared that learn higher-order feature interactions implicitly, but these models lack interpretability. To this end, we propose an automated feature construction framework, Tide Kit, which learns high-order interactions of the input features automatically, is broadly applicable to classification problems with numerical features, and has good model interpretability. The main work of the thesis is as follows:

(1) In terms of feature generation, we propose a new feature combination method based on the self-attention mechanism, implemented in the interaction layer of the model. In each interaction layer, higher-order features are combined through the attention mechanism, and the different combinations can be evaluated using their self-attention scores, so the learning process is interpretable. By stacking multiple interaction layers, combinations of the raw features of different orders can be modeled, and the process is fully automated.

(2) In terms of feature selection, we propose a novel feature selection method based on reinforcement learning. The feature selection process is formulated as a Markov Decision Process (MDP). The candidate probability of each feature is evaluated in parallel using policy gradients; through iterative exploration and exploitation of the generated features, a globally optimal feature generation and selection scheme is sought within a limited number of steps and then used to guide feature generation on the test set. In addition, we propose a new method based on meta-features for warm starting and for differentiating individual rewards, and establish a dynamic automated adjustment mechanism, thereby improving iteration efficiency.

We performed extensive experiments on eight real-world datasets. The experimental results show that the proposed method not only outperforms the latest methods in prediction performance, but also has good model interpretability. In addition, the dynamic auto-adjustment mechanism gives the model better convergence.
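The attention-based feature combination described in point (1) can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes standard scaled dot-product self-attention, random (untrained) projection matrices, and a hypothetical embedding in which each numerical feature value scales its own field vector. The names `interaction_layer`, `field_vectors`, and `random_matrix` are illustrative only.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def interaction_layer(embeddings, w_q, w_k, w_v):
    """One self-attention interaction layer over per-feature embeddings.

    embeddings: n_features x d matrix, one row per input feature.
    Returns (combined, scores); scores[i][j] is the attention weight that
    feature j receives when building combined feature i.
    """
    q = matmul(embeddings, w_q)
    k = matmul(embeddings, w_k)
    v = matmul(embeddings, w_v)
    scale = math.sqrt(len(k[0]))
    logits = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    scores = [softmax(row) for row in logits]
    combined = matmul(scores, v)
    return combined, scores


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_features, d = 4, 3

# Assumption: each numerical value x_i is embedded as x_i * field_vectors[i].
sample = [0.5, -1.2, 3.0, 0.1]
field_vectors = random_matrix(n_features, d, rng)
embeddings = [[x * e for e in vec] for x, vec in zip(sample, field_vectors)]

# Stack two interaction layers; each layer mixes the outputs of the previous
# one, so deeper layers represent higher-order combinations of raw features.
layers = [tuple(random_matrix(d, d, rng) for _ in range(3)) for _ in range(2)]
h = embeddings
all_scores = []
for w_q, w_k, w_v in layers:
    h, scores = interaction_layer(h, w_q, w_k, w_v)
    all_scores.append(scores)
```

Each row of a score matrix sums to 1, so `all_scores[layer][i]` can be read as how strongly each input feature contributes to combined feature `i` at that layer. In a trained model the projection matrices would be learned, and these weights are what would make the combinations inspectable.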

Keywords/Search Tags:

Numerical features, Automated feature engineering, Self-attention, Reinforcement learning

Related items

1	Research On Automated Feature Engineering Algorithm And System For Structured Data
2	Automatic Feature Engineering In Supervised Learning
3	Application Of Automatic Feature Engineering Based Representation Learning For Categorical Features
4	Research Of Automated Negotiation Based On Reinforcement Learning
5	Automatic Feature Engineering System For Tabular Data
6	Research On Multi-Issue Automated Negotiation Based On Agent Reinforcement Learning
7	Research On The Sparse Reward Problem Based On Hierarchical Reinforcement Learning
8	Research On Group Confrontation Strategies Based On Deep Reinforcement Learning
9	Product Domain Relation Extraction Based On Reinforcement Learning And Attention Mechanism Denoising
10	Research On Automatic Feature Engineering Algorithms For Classification Problems Of Categorical Features