
Research And Application Of Manifold-Learning Algorithms

Posted on: 2008-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: H L Zhu
Full Text: PDF
GTID: 2178360245491819
Subject: Computer application technology
Abstract/Summary:
In the information age, people must often process massive volumes of data, and the amount of data keeps growing at a geometric rate. Much of this data, however, is redundant. How to process it effectively, uncover its internal regularities, reduce its volume, and extract the information hidden in it has therefore become one of the core problems in artificial intelligence, machine learning, data mining, and related fields. Manifold-learning algorithms can find the intrinsic dimension of high-dimensional data, discard what is redundant, and expose the essential structure of the data, which improves the efficiency of processing massive information. This thesis focuses on fast manifold-learning algorithms that are applicable to massive data sets.

Mainstream manifold-learning algorithms fall into two classes: linear and nonlinear. The earlier algorithms are linear ones, represented by PCA; they are simple to implement but suitable only for linear data sets. Nonlinear algorithms, represented by Isomap and LLE, can discover the manifold in nonlinear data, but they generally have high time complexity and are not suitable for massive data sets. The embedding algorithm Anchor points based Isometric Embedding under least square error criterion (AIE) has a time complexity of O(n log n); once the geodesic distances have been obtained, embedding the points takes linear time and can be fully parallelized, so AIE can effectively speed up the processing of massive data.

Traditional search engine technologies rely mainly on the query the user types in to produce search results. When the query is short or its terms are ambiguous, this approach cannot accurately determine the field the user is interested in, and the quality of the search results suffers. A query expansion system based on clickthrough data can estimate the user's needs in real time by capturing the user's click actions. By using AIE to compress the webpage-difference information hidden in clickthrough data sets, the system can significantly reduce the space cost incurred when the search engine retrieves that information.
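
The abstract does not give the details of AIE, so the following is only a minimal sketch of the general idea it describes: once distances from every point to a small set of anchor points are known, each point can be placed independently by a least-squares fit, which is why that step is linear in the number of points and easy to parallelize. The sketch uses the well-known Landmark MDS triangulation as a stand-in, not the thesis's own method. The function names `pairwise_sq_dists` and `embed_with_anchors` are hypothetical, and plain Euclidean distances are used where a manifold method would supply geodesic ones.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between rows of a (n, m) and rows of b (k, m)."""
    diff = a[:, None, :] - b[None, :, :]
    return (diff ** 2).sum(axis=-1)


def embed_with_anchors(anchor_sq_dists, point_sq_dists, dim):
    """Embed points from their squared distances to k anchors (Landmark MDS style).

    anchor_sq_dists: (k, k) squared distances among the anchors.
    point_sq_dists:  (n, k) squared distances from each point to each anchor.
    Returns an (n, dim) array of coordinates.
    Assumes the top `dim` eigenvalues of the centred anchor matrix are positive.
    """
    k = anchor_sq_dists.shape[0]
    # Classical MDS on the anchors: double-centre the squared distances.
    centring = np.eye(k) - np.ones((k, k)) / k
    gram = -0.5 * centring @ anchor_sq_dists @ centring
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    lam, vecs = eigvals[top], eigvecs[:, top]
    # Least-squares map from "distances to anchors" to coordinates.
    pinv_t = vecs / np.sqrt(lam)                 # shape (k, dim)
    mean_sq = anchor_sq_dists.mean(axis=0)       # shape (k,)
    # Each row is handled independently, so this step is linear in n
    # and could be split across workers.
    return -0.5 * (point_sq_dists - mean_sq) @ pinv_t
```

A call such as `embed_with_anchors(pairwise_sq_dists(anchors, anchors), pairwise_sq_dists(points, anchors), 2)` returns one two-dimensional coordinate per point. In a manifold setting the two distance matrices would instead hold squared geodesic distances, for example shortest-path distances on a neighbourhood graph.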
Keywords/Search Tags: Manifold-learning, Search engine, Clickthrough data