Learning-driven Probability Reconstruction And Data Visualization

Posted on: 2020-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: X H Liao
Full Text: PDF
GTID: 2370330620959946
Subject: Control Science and Engineering
Abstract/Summary:
In the era of information explosion, processing and analyzing large-scale, high-dimensional data sets has become a major challenge for data mining and machine learning. To extract and intuitively understand the information underlying big data, effective data visualization techniques are in demand. Data visualization displays complex high-dimensional data sets as low-dimensional plots, from which the structure of the data can be grasped intuitively; this in turn greatly benefits data exploration and pattern recognition. Building on several traditional data visualization algorithms, the main contributions of this thesis are as follows:
(1) A fast nearest neighbor search algorithm based on the ANNOY algorithm is proposed. Starting from random projection trees, it searches for nearest neighbors by neighbor expansion, which greatly improves search speed while maintaining accuracy.
(2) A probability reconstruction algorithm based on neighborhood relationships and category information is proposed. It first constructs a neighborhood graph by neighbor expansion, and then reconstructs the probabilities between the original data samples from the neighborhood relationships and category information. The algorithm describes the similarity between high-dimensional data samples more accurately.
(3) A visualization algorithm based on P-BGLL is proposed. P-BGLL extends the BGLL algorithm by using probabilities as the weights between samples in the space, and the visualization is then carried out on the basis of P-BGLL. Compared with traditional data visualization algorithms, the P-BGLL-based algorithm better preserves both the global and local structure of high-dimensional data and produces good visualization results.
(4) A visualization algorithm based on Feature-Net is proposed. It first uses the Feature-Net network model to extract data features, which filters out noise and redundant information, and then reconstructs the probabilities using category information and neighborhood relationships. The algorithm describes the similarity between samples more accurately and achieves better visualization performance.
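The neighbor expansion idea in contribution (1) can be illustrated with a small sketch. This is not the thesis implementation: it assumes a single simplified random projection tree (split direction through two random points, median split) instead of the ANNOY library, plain Euclidean distance, and hypothetical function names such as `initial_graph` and `expand_neighbors`. It only shows the general pattern of refining an approximate k-nearest-neighbor graph by also examining the neighbors of each point's current neighbors.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rp_tree_leaves(points, ids, leaf_size, rng):
    """Split ids recursively along random directions; return the leaf buckets."""
    if len(ids) <= leaf_size:
        return [ids]
    # Assumption: direction through two random points, split at the median
    # projection so the tree stays balanced (a simplification for this sketch).
    p, q = (points[i] for i in rng.sample(ids, 2))
    direction = [x - y for x, y in zip(p, q)]
    ordered = sorted(
        ids, key=lambda i: sum(d * x for d, x in zip(direction, points[i]))
    )
    mid = len(ordered) // 2
    return (rp_tree_leaves(points, ordered[:mid], leaf_size, rng)
            + rp_tree_leaves(points, ordered[mid:], leaf_size, rng))


def top_k(points, i, candidates, k):
    """The k candidates closest to point i, ties broken by index."""
    return sorted(candidates, key=lambda j: (sq_dist(points[i], points[j]), j))[:k]


def initial_graph(points, k, leaf_size, rng):
    """Approximate kNN graph: each point only searches its own leaf."""
    graph = {}
    all_ids = list(range(len(points)))
    for leaf in rp_tree_leaves(points, all_ids, leaf_size, rng):
        for i in leaf:
            graph[i] = top_k(points, i, [j for j in leaf if j != i], k)
    return graph


def expand_neighbors(points, graph, k):
    """One round of neighbor expansion: also consider neighbors of neighbors."""
    new_graph = {}
    for i, nbrs in graph.items():
        candidates = set(nbrs)
        for j in nbrs:
            candidates.update(graph[j])
        candidates.discard(i)  # a point is never its own neighbor
        new_graph[i] = top_k(points, i, candidates, k)
    return new_graph


def recall(points, graph, k):
    """Fraction of the exact k nearest neighbors that appear in the graph."""
    n = len(points)
    hits = 0
    for i in range(n):
        exact = top_k(points, i, [j for j in range(n) if j != i], k)
        hits += len(set(exact) & set(graph[i]))
    return hits / (k * n)
```

Calling `expand_neighbors` on the output of `initial_graph` and comparing `recall` before and after shows the effect: because each point's candidate set always contains its current neighbors, a round of expansion cannot lower recall, and in practice it recovers true neighbors that the tree placed in a different leaf. Several rounds can be applied in sequence, at the cost of more distance computations.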
Keywords/Search Tags: nearest neighbor search, probability reconstruction, data visualization, convolutional neural network, clustering