The Research On Unsupervised Feature Selection In Multiple Scenarios

Posted on:2024-07-27

Degree:Master

Type:Thesis

Country:China

Candidate:J Yu

Full Text:PDF

GTID:2530307085498584

Subject:Statistics

Abstract/Summary:

With the rapid development of computer and information technology, the ability to collect and extract data has improved continuously, and data sources and forms have become more complex and diverse. According to the source scenario, data can be divided into single-view and multi-view data. In addition, data obtained in practice is usually high-dimensional. High-dimensional data brings a great deal of information to a research task, but the noise and redundant information it contains also degrade the performance of data processing and analysis. Feature selection is an effective way to reduce the dimension of data and has been widely used in machine learning and data mining in recent years. At the same time, because data labels are difficult to obtain in practice and manual labeling is relatively costly, much of the data obtained is unlabeled. Unsupervised learning can solve the feature selection problem without using label information. Therefore, this thesis studies unsupervised feature selection in the single-view and multi-view scenarios respectively. The main research work and results are as follows:

(1) For single-view unsupervised feature selection, we propose a model based on spectral clustering and feature redundancy minimization. The model integrates feature selection, spectral clustering and feature redundancy minimization into a unified framework, so that it selects features that are both discriminative and low in redundancy. Specifically, the model learns the geometric structure of the data through spectral clustering and introduces the resulting cluster structure into an orthogonal regression model, so that more discriminative information is retained in the low-dimensional space. At the same time, the model removes redundant features effectively by minimizing a feature redundancy loss. In addition, we propose an alternating iterative method, based on existing optimization algorithms, to solve the objective function, and we analyze its convergence and time complexity. Experimental results show that the proposed algorithm is significantly superior to other single-view unsupervised feature selection algorithms.

(2) For multi-view unsupervised feature selection, we propose a novel method: adaptive hypergraph learning for multi-view unsupervised feature selection. Through orthogonal decomposition, the method factorizes the target matrix of the projection space into an orthogonal basis and a cluster indicator matrix, and it combines the information of the different views to learn a consensus matrix. To avoid the influence of noise and outliers in the original data and to preserve the high-order local geometric structure among samples, we introduce adaptive hypergraph learning into the model and combine the similarity structure of the hypergraph with the consensus matrix, which makes effective use of the complementary and consistent information in multi-view data. In addition, we design an alternating optimization algorithm to solve the proposed objective function and analyze the convergence and time complexity of the algorithm. Experimental results on different datasets show the effectiveness of the proposed algorithm.

(3) We apply the proposed feature selection algorithm to the selection of credit evaluation indicators. With the rapid development of consumer credit in modern society, financial institutions need to build personal credit evaluation models to analyze the credit status of loan applicants. To remove irrelevant and redundant features from credit data and improve the performance of credit evaluation models, we use the feature selection method proposed in this thesis to select credit evaluation indicators and verify it on the publicly available Australian Credit Dataset. The experimental results show that the proposed method performs better than the other feature selection methods compared.
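
The single-view idea in (1), selecting features that agree with a spectral cluster structure while penalizing redundancy, can be illustrated with a minimal sketch. This is not the thesis's model: the abstract gives no objective function, so the sketch replaces the joint orthogonal-regression optimization with a simple greedy rule. The function names, the k-nearest-neighbour Gaussian graph, the bandwidth heuristic, and the trade-off parameter `lam` are all assumptions made for illustration.

```python
import numpy as np

def knn_affinity(X, n_neighbors=5):
    """Symmetric k-nearest-neighbour graph with Gaussian weights."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)                  # exclude self-loops
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(X.shape[0]), n_neighbors)
    cols = idx.ravel()
    sigma2 = np.mean(d2[rows, cols])              # bandwidth heuristic (assumption)
    W = np.zeros(d2.shape)
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    return np.maximum(W, W.T)                     # symmetrize

def spectral_embedding(W, n_clusters):
    """Eigenvectors of the unnormalized Laplacian with the smallest eigenvalues."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)                   # eigenvalues in ascending order
    return vecs[:, :n_clusters]

def select_features(X, n_clusters, n_select, lam=1.0, n_neighbors=5):
    """Greedy selection: spectral relevance minus a redundancy penalty."""
    Z = X - X.mean(axis=0)
    Z = Z / np.linalg.norm(Z, axis=0)             # assumes no constant column
    F = spectral_embedding(knn_affinity(X, n_neighbors), n_clusters)
    # Share of each feature's variance captured by the cluster structure.
    relevance = np.sum((F.T @ Z) ** 2, axis=0)
    corr2 = (Z.T @ Z) ** 2                        # squared pairwise correlations
    chosen = np.zeros(X.shape[1], dtype=bool)
    selected = []
    for _ in range(n_select):
        score = relevance.copy()
        if selected:
            # Penalize similarity to the features already selected.
            score -= lam * corr2[:, selected].mean(axis=1)
        score[chosen] = -np.inf
        best = int(np.argmax(score))
        chosen[best] = True
        selected.append(best)
    return selected

# Toy data: column 0 separates two clusters, column 1 is a near copy of it,
# and columns 2 to 5 are pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
f0 = 6.0 * y + 0.5 * rng.standard_normal(60)
f1 = f0 + 0.01 * rng.standard_normal(60)
X = np.column_stack([f0, f1, rng.standard_normal((60, 4))])
selected = select_features(X, n_clusters=2, n_select=2)
print(selected)
```

In this toy setting the first pick should be one of the two cluster-aligned columns, and the redundancy penalty should steer the second pick away from its near copy. A joint formulation such as the one the abstract describes would instead learn the feature weights and the cluster structure together.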

Keywords/Search Tags:

Unsupervised feature selection, Multi-view data, Minimize feature redundancy, Hypergraph learning

Related items

1	Research On Unsupervised Feature Selection Method Based On Adaptive Hypergraph
2	Hypergraph Low-rank Feature Selection Multi-target Regression Algorithm
3	Research On Unsupervised Feature Selection Algorithm Based On Graph
4	Research On Feature Selection And Feature Subset Redundancy For Gene Expression Data
5	Unsupervised Feature Selection Based On Graph Regularization
6	Researches On Unsupervised Feature Selection Algorithms Based On Sparse Regression And Manifold Learning
7	An Improved Multi-cluster Unsupervised Feature Selection Method And Its Application In The Detection Of Construction Land Change
8	Research On Feature Engineering And Feature Selection Algorithm Of Biogenetic Data Based On CNN
9	ECG Biometric Recognition Based On Feature Learning And Multi-feature Fusion
10	Feature Selection And Classification Based On Terahertz Time-Domain Spectral Data