
Mixed Sparsity Regularized Multi-view Unsupervised Feature Selection

Posted on: 2019-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: J L Wan
GTID: 2428330599450398
Subject: Computer Science and Technology

Abstract/Summary:
Feature selection is an efficient and effective data preprocessing technique in machine learning and data mining. By removing irrelevant and redundant features from the data, it reduces the dimensionality of the feature space and the computational complexity of learning, improves learning accuracy, and facilitates a better understanding of the learned model, which in turn reduces computational time and cost. The objective of feature selection, also known as subset selection, is to build simpler yet highly comprehensible models that improve data-mining performance while preparing clean and understandable data.

The advent of big data has led to a substantial daily accumulation of data across the globe. This accumulated raw data is high-dimensional, requiring huge storage facilities and rendering many known machine learning algorithms ineffective, among other serious challenges faced by modern researchers in machine learning and data mining. Data can be collected from diverse backgrounds or modalities and described from multiple views. Reducing the feature dimensionality of multi-view data has been found to be especially challenging in unsupervised settings, as opposed to supervised ones, owing to the data's inherently multi-view nature and high dimensionality. In particular, characterizing the relationships among views has proved difficult for multi-view unsupervised feature selection.

In this thesis, a novel method for multi-view unsupervised feature selection is proposed that imposes sparsity on both views and individual features. The importance of each view is taken into consideration, without introducing explicit view weights, when exploiting complementary information across views. Seven diverse publicly available datasets, each containing at least three different views, were used as benchmarks for comparison in the experiments. In the parameter settings, the neighborhood size k is set to 5 on all datasets for all compared methods. A grid-search strategy is used to tune the parameters λ1 and λ2, which control the sparsity of the involved matrices. The number of selected feature dimensions is varied, with all experiments repeated and the average results over the different dimensions reported. Experiments on the benchmark datasets show the superiority of the proposed mixed sparsity regularized multi-view unsupervised feature selection (MSMFS) algorithm, which outperforms other state-of-the-art unsupervised feature selection techniques.
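The abstract does not spell out the MSMFS objective, but embedded sparsity-regularized feature selection of this kind is commonly built on an ℓ2,1 (group-sparse) penalty that zeroes out whole rows of a projection matrix, so that features can be ranked by their row norms. The sketch below illustrates that general idea only, under assumed names and a single regularization parameter `lam` standing in for the role of λ1/λ2; it is not the thesis's actual algorithm. It solves min_W ||XW − F||² + λ||W||₂,₁ by iteratively reweighted least squares and scores each feature by the ℓ2-norm of its row of W:

```python
import numpy as np

def l21_feature_scores(X, F, lam=1.0, n_iter=50, eps=1e-8):
    """Hypothetical sketch: solve min_W ||X W - F||_F^2 + lam * ||W||_{2,1}
    by iteratively reweighted least squares (IRLS), then score feature j
    by the l2-norm of row j of W. Rows driven to zero mark features
    the sparsity penalty has pruned."""
    d = X.shape[1]
    D = np.eye(d)  # reweighting matrix, refreshed every iteration
    for _ in range(n_iter):
        # Closed-form update for the reweighted ridge subproblem
        W = np.linalg.solve(X.T @ X + lam * D, X.T @ F)
        row_norms = np.linalg.norm(W, axis=1)
        # D_jj = 1 / (2 ||w_j||), guarded against division by zero
        D = np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))
    return np.linalg.norm(W, axis=1)

# Toy data: targets F depend only on the first two of ten features,
# so those two rows of W should survive the sparsity penalty.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 10))
F = X[:, :2] @ rng.standard_normal((2, 3))
scores = l21_feature_scores(X, F, lam=5.0)
selected = np.argsort(scores)[::-1][:2].tolist()
print(sorted(selected))  # the two relevant features should rank highest
```

In a multi-view extension such as the one the thesis describes, one would additionally group rows by view, so that the penalty can discount an entire uninformative view as well as individual features; the tuning of λ1 and λ2 via grid search then trades off these two levels of sparsity.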
Keywords/Search Tags: unsupervised feature selection, group sparsity, curse of dimensionality, embedded methods, filter methods, parameter-free learning