
An information-theoretic perspective for learning systems with engineering applications

Posted on: 1997-10-31
Degree: Ph.D
Type: Thesis
University: University of Florida
Candidate: Wang, Chuan
GTID: 2468390014984322
Subject: Electrical engineering
Abstract/Summary:
The major goal of this study is to build a unifying perspective for most learning systems (adaptive filters and neural networks). A detailed analysis of the adaptation rules is presented from the point of view of generalized correlation learning. The analysis also reveals that learning in recurrent networks is equivalent to second-order correlation learning with different time lags, which explains why recurrent systems extract temporal information. It is well known that supervised systems can be used either for static learning (functional-mapping MLPs) or for temporal learning (time-delay neural networks or the Gamma model). In unsupervised learning, however, almost all neural networks are trained statically because there is no teacher signal. A unified perspective of temporal supervised and unsupervised learning therefore requires a mathematical extension of unsupervised learning to the temporal domain. The extension of static unsupervised systems to temporal learning focuses on the Principal Component Analysis (PCA) network. PCA is one of the dominant networks in the unsupervised family, and it is based on the Hebbian rule, which by itself plays a fundamental role in unsupervised learning. PCA in time is examined in detail. It is shown that PCA in time yields a set of adaptive, time-varying orthogonal bases, ordered by variance, that constitute the signal subspace. The relationships between PCA in time, Fourier analysis, and wavelets are also pointed out. An application to subspace adaptive filtering is outlined which significantly decreases the training time. Then, as an application of the PCA concepts to time processing, a neural topology is proposed that computes the cross-correlation and autocorrelation on-line. The algorithm exploits the unifying perspective developed for the learning rules based on correlation learning.
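The Hebbian basis of PCA networks mentioned above can be illustrated with Oja's single-unit rule, the classical Hebbian update with a normalizing decay term that converges to the principal eigenvector of the input covariance. The following is a minimal sketch on synthetic data, not code from the thesis; the data dimensions, learning rate, and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic zero-mean data with one dominant direction (illustrative setup).
n, d = 5000, 3
mixing = rng.standard_normal((d, d))
scales = np.array([3.0, 1.0, 0.3])          # one variance clearly dominates
X = (rng.standard_normal((n, d)) * scales) @ mixing.T
X -= X.mean(axis=0)

# Oja's rule: w <- w + eta * y * (x - y * w), with output y = w . x.
# The y*w decay term keeps ||w|| near 1 while the Hebbian term eta*y*x
# rotates w toward the direction of maximum variance.
w = rng.standard_normal(d)
w /= np.linalg.norm(w)
eta = 1e-3
for _ in range(5):                           # a few passes for convergence
    for x in X:
        y = w @ x
        w += eta * y * (x - y * w)

# Compare against the top eigenvector of the sample covariance.
C = X.T @ X / n
top = np.linalg.eigh(C)[1][:, -1]
alignment = abs(w @ top) / np.linalg.norm(w)
```

After training, `alignment` should be close to 1, i.e. the purely local Hebbian update has recovered the first principal component without any teacher signal.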
This network is then used for blind source separation, a difficult problem because the solution must estimate the transfer function of a linear system from the outputs of the system alone. We then turn to the other goal of the thesis: to propose a unified perspective for both supervised and unsupervised learning. A simple but poorly understood relationship between the two is revealed: when the desired signal is zero-mean noise, supervised learning is statistically equivalent to unsupervised learning. This result, combined with the theory of autoassociative learning, provides the basis for a perspective on learning from the point of view of information theory. The main theoretical conclusion of the thesis can be stated as follows: in a supervised learning system, when the mutual information between the input and the desired signal reaches an extreme (maximum or minimum), the learning degenerates into an unsupervised paradigm. From this perspective, the classification of learning as supervised or unsupervised is based not only on the existence of a desired signal but also on the relationship between the external signals.
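The zero-mean-noise equivalence can be made concrete with the LMS filter, the simplest supervised correlation learner. If the teacher d is zero-mean noise independent of the input x, then E[d·x] = 0 and the expected LMS update reduces to -eta·C·w, a purely input-driven (anti-Hebbian decay) rule with no supervised component. The sketch below is an illustration under that assumption, not an experiment from the thesis; all names and constants are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 20000, 4

# Correlated zero-mean inputs and an independent zero-mean "teacher".
X = rng.standard_normal((n, d)) @ rng.standard_normal((d, d))
X -= X.mean(axis=0)
teacher = rng.standard_normal(n)            # pure noise, uncorrelated with X

eta = 1e-4
w_sup = 0.1 * rng.standard_normal(d)
w_unsup = w_sup.copy()

for x, dsig in zip(X, teacher):
    # Supervised LMS with a noise teacher: w += eta * (d - w.x) * x
    w_sup += eta * (dsig - w_sup @ x) * x
    # Unsupervised rule keeping only the input-correlation term: w += -eta * (w.x) * x
    w_unsup += eta * (-(w_unsup @ x)) * x

# With E[d*x] = 0, the two weight trajectories stay statistically close:
gap = np.linalg.norm(w_sup - w_unsup)
```

The residual `gap` comes only from the zero-mean fluctuation eta·d·x, which averages out, so the "supervised" filter behaves like an unsupervised one, which is the statistical equivalence the abstract refers to.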
Keywords/Search Tags: Perspective, Systems, Unsupervised learning, Desired signal, PCA, Information, Time