
Research on Fast and Incremental Dimensionality Reduction

Posted on: 2018-11-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T Zhu
GTID: 1318330542465667
Subject: Computer Science and Technology
Abstract/Summary:
With the development of science and technology, huge amounts of data are generated rapidly in almost every field, such as the Internet, finance, transportation, bioinformatics, and astronomy. In many applications, not only has the volume of collected data increased, but its dimensionality has also grown. High-dimensional data present not only opportunities but also great challenges: on the one hand, directly processing high-dimensional data is computationally expensive; on the other hand, the "curse of dimensionality" may arise in high-dimensional data analysis. Therefore, how to efficiently and effectively extract useful information from high-dimensional data is an open problem worth studying. As an important tool for addressing this problem, dimensionality reduction has attracted great attention.

Dimensionality reduction algorithms aim to project the original data into a lower-dimensional space while preserving the information of interest. To overcome the deficiencies of traditional dimensionality reduction algorithms in fast learning and incremental learning, we conduct a series of studies, ranging from linear dimensionality reduction to manifold learning and from fast dimensionality reduction to incremental dimensionality reduction, and propose several new algorithms. The main contributions of this thesis include:

1) Studying linear dimensionality reduction, transforming the linear dimensionality reduction problem into a basis-extraction problem, and proposing a simple and effective threshold scheme that achieves automatic target-dimension estimation and high-speed dimensionality reduction.

2) Proposing a semi-nonnegative matrix factorization (semi-NMF) algorithm based on data selection. It allows the elements of the coefficient matrix to have mixed signs. By employing the adaptive threshold scheme, the computational efficiency of matrix factorization is improved, the target dimension is determined automatically, and the quality of the obtained basis matrix is assured.

3) Proposing a fast linear dimensionality reduction algorithm named orthogonal component analysis (OCA). OCA extracts orthogonal components (bases) with low computational complexity and estimates the target dimension automatically, while avoiding both the eigenproblem and matrix inversion; furthermore, OCA is guaranteed to be numerically stable (see the first sketch below).

4) Proposing the incremental orthogonal component analysis (IOCA) algorithm, which extracts the desired orthogonal basis automatically and rapidly in an online environment. IOCA achieves incremental learning by increasing the dimension of the feature subspace during data stream processing (see the second sketch below).

5) Proposing the evolutionary orthogonal component analysis (EOCA) algorithm. By adjusting the standard orthogonal basis, EOCA can conveniently merge two subspaces into a new one, and it achieves online subspace learning with low computational complexity.

6) Proposing the topology learning embedding (TLE) algorithm to achieve fast and incremental nonlinear dimensionality reduction. TLE extracts a small number of representative nodes from an online input data stream and constructs a topology-preserving network to approximate the structure of the data. Because the resulting data structure is simplified, the computational load and storage cost of nonlinear dimensionality reduction are greatly reduced. Moreover, TLE achieves incremental manifold-structure learning and out-of-sample embedding.

In summary, this thesis proposes a series of algorithms for fast and incremental dimensionality reduction, and the experimental results demonstrate that these algorithms are effective and efficient.
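To make the thresholded basis-extraction idea concrete, the following Python sketch greedily pulls orthonormal components out of a data matrix and stops once every residual falls below an adaptive threshold, so the target dimension is simply the number of extracted components and no eigenproblem or matrix inverse is involved. This is only a minimal illustration in the spirit of OCA, not the algorithm from the thesis; the function name, the greedy largest-residual selection rule, and the specific threshold are assumptions.

```python
import numpy as np

def extract_orthogonal_components(X, threshold=0.1):
    """Greedy orthonormal basis extraction with an adaptive threshold.

    Illustrative sketch only (not the thesis's OCA): repeatedly take the
    sample with the largest residual norm as a new basis direction,
    deflate all residuals, and stop when the remaining residual energy
    drops below `threshold` times the largest initial norm.
    """
    R = np.asarray(X, dtype=float).copy()        # residuals, one row per sample
    tol = threshold * np.linalg.norm(R, axis=1).max()
    basis = []
    while True:
        norms = np.linalg.norm(R, axis=1)
        i = norms.argmax()
        if norms[i] <= tol:                      # all residuals below threshold:
            break                                # dimension estimated automatically
        q = R[i] / norms[i]                      # new orthonormal component
        basis.append(q)
        R -= np.outer(R @ q, q)                  # remove this direction everywhere
    W = np.array(basis)                          # (k, d); k = estimated dimension
    return W, X @ W.T                            # basis and reduced representation

# usage: 500 samples in 100 dimensions lying in a 5-dimensional subspace
X = np.random.randn(500, 5) @ np.random.randn(5, 100)
W, Y = extract_orthogonal_components(X, threshold=0.05)
print(W.shape, Y.shape)                          # (5, 100) and (500, 5)
```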
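In the incremental setting, the same idea can be run over a data stream: the subspace grows by one dimension whenever a new sample is poorly represented by the current basis. The sketch below illustrates this IOCA-style update under simplifying assumptions (a fixed tolerance and hypothetical names); the actual IOCA update rule is the one given in the thesis.

```python
import numpy as np

def incremental_update(W, x, tol):
    """Grow the orthonormal basis W (one component per row) if the
    streaming sample x is not well represented by the current feature
    subspace. Illustrative IOCA-style sketch, not the published rule."""
    r = x - W.T @ (W @ x)           # residual of x after projection onto W
    nr = np.linalg.norm(r)
    if nr > tol:                    # x carries a new direction:
        W = np.vstack([W, r / nr])  # increase the subspace dimension by one
    return W

# usage on a simulated stream of 80-dimensional samples
d = 80
W = np.zeros((0, d))                # start with an empty subspace
for x in np.random.randn(300, 10) @ np.random.randn(10, d):
    W = incremental_update(W, x, tol=1e-8)
print(W.shape)                      # (10, 80): the stream's intrinsic dimension
```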
Keywords/Search Tags:dimensionality reduction, intrinsic dimension estimation, incremental learning, subspace learning, manifold learning