Learning-driven Probability Reconstruction And Data Visualization

Posted on:2020-03-28

Degree:Master

Type:Thesis

Country:China

Candidate:X H Liao

GTID:2370330620959946

Subject:Control Science and Engineering

Abstract/Summary:

In the era of information explosion, processing and analyzing large-scale, high-dimensional data sets has become a major challenge for data mining and machine learning. To obtain and intuitively understand the information underlying big data, effective data visualization techniques are in demand. Data visualization can display complex high-dimensional data sets as low-dimensional plots, from which the structure of the data can be grasped intuitively; this also greatly benefits data exploration and pattern recognition. Building on several traditional data visualization algorithms, the main contributions of this thesis are as follows:

(1) A fast nearest neighbor search algorithm based on the ANNOY algorithm is proposed. Starting from random projection trees, the algorithm searches for nearest neighbors by neighbor expansion, which greatly improves the speed of nearest neighbor search while preserving accuracy.

(2) A probability reconstruction algorithm based on neighborhood relationships and category information is proposed. The algorithm first constructs a neighborhood graph by neighbor expansion, and then reconstructs the probabilities between the original data samples from the neighborhood relationships and category information. It describes the similarity between high-dimensional data samples more accurately.

(3) A visualization algorithm based on P-BGLL is proposed. P-BGLL extends the BGLL algorithm by using probabilities as the weights between samples, and visualization is then carried out on the basis of P-BGLL. Compared with traditional data visualization algorithms, the P-BGLL-based algorithm better retains both the global and local structure of high-dimensional data and obtains good visualization results.

(4) A visualization algorithm based on Feature-Net is proposed. The algorithm first uses the Feature-Net network model to extract data features, which filters out noise and redundant information, and then reconstructs the probabilities using category information and neighborhood relationships. It describes the similarity between samples more accurately and achieves better visualization performance.
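The abstract does not give formulas for contributions (1) and (2), so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes that "neighbor expansion" means refining an approximate neighbor graph by also checking each point's neighbors of neighbors, and that "probability reconstruction" means turning distances on that graph into normalized similarities that are raised for samples sharing a category label. The function names, the Gaussian kernel, and the `sigma` and `same_class_boost` parameters are all hypothetical choices made for illustration; the initial graph is simply passed in, standing in for one built from random projection trees.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def expand_neighbors(points, init_graph, k, rounds=2):
    """Refine an approximate k-nearest-neighbor graph.

    In each round, the candidates for point i are its current neighbors
    plus the neighbors of those neighbors; the k closest are kept.
    init_graph maps each index to a list of k neighbor indices (assumed
    to come from some approximate method, e.g. random projection trees).
    """
    graph = {i: list(nbrs) for i, nbrs in init_graph.items()}
    for _ in range(rounds):
        new_graph = {}
        for i, nbrs in graph.items():
            candidates = set(nbrs)
            for j in nbrs:
                candidates.update(graph[j])
            candidates.discard(i)  # a point is not its own neighbor
            ranked = sorted(candidates, key=lambda j: sq_dist(points[i], points[j]))
            new_graph[i] = ranked[:k]
        graph = new_graph
    return graph


def reconstruct_probabilities(points, labels, graph, sigma=1.0, same_class_boost=2.0):
    """Assign each graph edge a probability-like similarity.

    Hypothetical scheme: a Gaussian affinity on the distance, multiplied
    by same_class_boost when both samples share a label, then normalized
    so that each point's outgoing values sum to 1.
    """
    probs = {}
    for i, nbrs in graph.items():
        weights = {}
        for j in nbrs:
            w = math.exp(-sq_dist(points[i], points[j]) / (2.0 * sigma ** 2))
            if labels[i] == labels[j]:
                w *= same_class_boost
            weights[j] = w
        total = sum(weights.values())
        probs[i] = {j: w / total for j, w in weights.items()} if total > 0 else {}
    return probs
```

Calling `expand_neighbors` on a rough initial graph and passing the result to `reconstruct_probabilities` yields, for every sample, a distribution over its neighbors that could serve as edge weights for a later clustering or layout step. Because each round keeps the best `k` of a candidate set that contains the current neighbors, the total neighbor distance never increases from one round to the next.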

Keywords/Search Tags:

nearest neighbor search, probability reconstruction, data visualization, convolutional neural network, clustering

Related items

1	Recognition Of Essential Proteins Based On Improved Edge Clustering Coefficient And K-nearest Neighbor Algorithm
2	Efficient Clustering Algorithm For Large-Scale Single-Cell Transcriptome Data
3	Heart Sounds Collection And The Abnormal Detection Based On Convolutional Neural Network
4	Research And Application Of Key Technologies For Complex Vector Search Based On Nearest Neighbor Graph
5	Application Of Nearest Neighbor Clustering And MCP In K-Arm DNA Computing
6	Study Of Data-driven Convolutional Neural Network For Tomographic Reconstruction Of Electrical Capacitance
7	Research On Spectral Reflectance Reconstruction Algorithm For Color Reproductio
8	Research On Clustering Method And Semi-supervised Method Based On Hybrid K-nearest-neighbor Graph
9	Research On Electromagnetic Inverse Scattering Reconstruction Algorithm Based On Convolutional Neural Network
10	Research Of Classification Algorithm Based On K Nearest Neighbor