
Automatic Feature Learning Methods Based On Neural Networks

Posted on: 2019-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yang
GTID: 2348330545476685
Subject: Computer Science and Technology
Abstract/Summary:
With the explosive growth in the volume of data, it has become particularly important to use machine learning models to mine valuable information from massive datasets. Extensive experience suggests that input features are essential for training an effective machine learning model. In the field of feature extraction, manual feature engineering has long occupied a dominant position. However, manual feature engineering is time-consuming and requires expert knowledge. In recent years, using machine learning models to learn features automatically has been widely studied. Neural networks are among the most popular machine learning models and achieve better performance than traditional manual feature engineering in many tasks, such as computer vision, speech recognition, and natural language processing. In this thesis, we propose two feature learning methods based on neural networks. The main contributions are:

1. A Shapelet Learning Method Based on SOINN for Time Series. In the first work, we focus on time series data and consider the situation where the training dataset is small. Since deep learning models require massive training data, deep neural networks are ill-suited to extracting features from small datasets. Therefore, we leverage an unsupervised self-organizing incremental neural network (SOINN) to learn a large number of local patterns from the time series data. A feature selection method is then applied to select the local patterns that are useful for the target task. Experimental results on real datasets show that our method is much faster than related methods, and its classification accuracy is also outstanding.

2. Operation-aware Embedding Method for Categorical Features. In the second work, we focus on multi-field categorical data and consider the situation where the training dataset is very large. In many datasets, categorical features are high-dimensional after one-hot encoding, so feeding them directly into a deep neural network introduces too many network parameters. Existing methods usually embed each categorical feature as a low-dimensional vector representation. However, in a network with multiple types of operations, a single embedded representation has to take part in multiple operations, which makes it hard to learn a good feature representation. To address this limitation, we propose a new feature embedding method, named operation-aware embedding. The proposed method learns the most suitable representation of each feature under each operation context. Experimental results show that our method outperforms other methods and achieves state-of-the-art performance on two real click-through/conversion rate prediction tasks.
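The contrast between a conventional shared embedding and an operation-aware embedding can be illustrated with a minimal sketch. It is not the thesis's implementation: the table sizes, the function names (`lookup_shared`, `lookup_op_aware`, `interaction`), and the use of an inner product as the example operation are all assumptions made for illustration. The only idea taken from the abstract is that each feature keeps a separate representation per operation instead of one representation reused everywhere.

```python
import random

random.seed(0)

# Hypothetical sizes chosen only for illustration.
NUM_FEATURES = 6   # distinct categorical values after one-hot encoding
NUM_OPS = 3        # operations in the network that consume embeddings
EMBED_DIM = 4      # embedding dimension


def random_vector(dim):
    """Return a randomly initialised embedding vector (stand-in for a learned one)."""
    return [random.gauss(0.0, 1.0) for _ in range(dim)]


# Conventional embedding: one vector per feature, reused by every operation.
shared_table = [random_vector(EMBED_DIM) for _ in range(NUM_FEATURES)]

# Operation-aware embedding (assumed layout): one vector per (feature, operation) pair.
op_aware_table = [
    [random_vector(EMBED_DIM) for _ in range(NUM_OPS)]
    for _ in range(NUM_FEATURES)
]


def lookup_shared(feature_id, op_id):
    """Ignore the operation: every operation sees the same vector."""
    return shared_table[feature_id]


def lookup_op_aware(feature_id, op_id):
    """Select the vector dedicated to this operation."""
    return op_aware_table[feature_id][op_id]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def interaction(lookup, feat_a, feat_b, op_id):
    """Example operation: inner product of two feature embeddings."""
    return dot(lookup(feat_a, op_id), lookup(feat_b, op_id))


def count_parameters_shared():
    return NUM_FEATURES * EMBED_DIM


def count_parameters_op_aware():
    return NUM_FEATURES * NUM_OPS * EMBED_DIM
```

In this sketch the operation-aware table holds `NUM_OPS` times as many embedding parameters as the shared table, which is the price paid for letting each operation work with its own representation of a feature.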
Keywords/Search Tags:Neural Networks, Feature Learning, Time Series, Computational Advertising