Research On High-Dimensional Data Tree Index Based On Soft Subspace

Posted on:2016-01-30

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Shen

Full Text:PDF

GTID:2208330470970581

Subject:Computer software and theory

Abstract/Summary:

High-dimensional indexing is a fundamental problem in content-based similarity retrieval and has received sustained attention from researchers for years. To reduce query overhead and speed up similarity search in high-dimensional space, the usual approach is to build a high-dimensional index that supports this type of query. Much of the research on indexing focuses on appropriate dimensionality reduction and on designing effective index structures. Clustering is a commonly used technique for query processing and statistics over large volumes of data, and soft subspace clustering, an important branch of the feature selection and feature transformation problems, can improve clustering efficiency on high-dimensional data sets. This thesis designs and implements a subspace-based clustering algorithm for high-dimensional data and, on that basis, builds an index tree suited to subspace clustering.

First, the subspace clustering problem is introduced, and feature selection theory is then studied in order to improve a soft subspace clustering algorithm that discovers the different clusters hidden in different subspaces. Second, a high-dimensional space partitioning strategy is proposed based on the spatial distribution of the subspaces and the clusters; combined with region coverage, this strategy is used to build the index tree structure for subspace clustering. Finally, to improve query efficiency on high-dimensional data, a filter query algorithm that operates on the index tree is put forward. Experiments on synthetic data sets of different sizes and on real data sets show that the index tree built on the soft subspace clustering algorithm improves query efficiency for high-dimensional data.
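The abstract does not give the details of the filter query algorithm, so the following is only a minimal sketch of the general filter-and-refine idea behind cluster-based indexes, not the thesis's method. It assumes a flat, one-level index (not a tree), plain Euclidean distance instead of per-cluster subspace weights, and cluster labels supplied by some earlier clustering step. The names `build_index` and `nearest_neighbor` are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(points, labels, k):
    """Group points by cluster label; store each cluster's center and covering radius."""
    clusters = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # skip empty clusters
        dim = len(members[0])
        center = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        radius = max(dist(center, p) for p in members)
        clusters.append({"center": center, "radius": radius, "members": members})
    return clusters


def nearest_neighbor(index, query):
    """Filter-and-refine 1-NN search over the cluster index.

    Filter: by the triangle inequality, no point in a cluster can be closer
    to the query than dist(query, center) - radius, so a cluster whose lower
    bound is not below the best distance found so far is skipped.
    Refine: surviving clusters are scanned point by point.
    Returns (best_point, best_distance, number_of_points_scanned).
    """
    # Visit clusters in increasing order of their lower bound.
    order = sorted(index, key=lambda c: dist(query, c["center"]) - c["radius"])
    best, best_d, scanned = None, math.inf, 0
    for c in order:
        lower = max(0.0, dist(query, c["center"]) - c["radius"])
        if lower >= best_d:
            break  # all remaining clusters have an equal or larger lower bound
        for p in c["members"]:
            scanned += 1
            d = dist(query, p)
            if d < best_d:
                best, best_d = p, d
    return best, best_d, scanned


if __name__ == "__main__":
    random.seed(1)
    data = [[random.random() for _ in range(16)] for _ in range(500)]
    # Stand-in for a real clustering step: assign each point to the nearest of 8 seeds.
    seeds = data[:8]
    labels = [min(range(8), key=lambda c: dist(p, seeds[c])) for p in data]
    index = build_index(data, labels, 8)
    q = [random.random() for _ in range(16)]
    _, d, scanned = nearest_neighbor(index, q)
    print("nearest distance:", d, "points scanned:", scanned, "of", len(data))
```

The search returns the same nearest distance as a full scan, since pruning only discards clusters that provably cannot hold a closer point. How many points it skips depends on how compact and well separated the clusters are, which is where a clustering step adapted to high-dimensional data would matter.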

Keywords/Search Tags:

High-dimensional data, Soft subspace clustering, Feature selection, Index tree

Related items

1	Feature Selection And Clustering For High-dimensional Data
2	Research On Subspace Clustering Algorithm Guided By Soft Labels
3	Research On Clustering Algorithms For High-Dimensional Data
4	Research Of Subspace-clustering Algorithms Based On Density Over High-dimensional Data
5	Research And Implementation Of Clustering Method For High Dimensional Categorical Data
6	Research On Optimization Of Soft Subspace Clustering Based On Flower Pollination Algorithm
7	The Algorithm Of Soft Subspace Clustering Based On Particle Swarm Optimization
8	Research And Application Of Soft Subspace Clustering Algorithms
9	The Research On Subspace Clustering For High Dimensional Data
10	Research On Feature Selection Algorithms Based On Decision Tree For High-dimensional Data