Sequential pattern classification without explicit feature extraction

Posted on: 2006-09-06
Degree: Ph.D.
Type: Thesis
University: State University of New York at Buffalo
Candidate: Lei, Hansheng
Full Text: PDF
GTID: 2458390008973873
Subject: Engineering
Abstract/Summary:
Feature selection, representation, and extraction are integral to statistical pattern recognition systems. Usually, features are represented as vectors that capture expert knowledge of measurable discriminative properties of the classes to be distinguished. The feature selection process entails manual expert involvement and repeated experiments. Automatic feature selection is necessary when (i) expert knowledge is unavailable, (ii) distinguishing features among classes cannot be quantified, or (iii) a fixed-length feature description cannot faithfully reflect all possible variations of the classes, as in the case of sequential patterns (e.g., time series data). Automatic feature selection and extraction are also useful when developing pattern recognition systems that are scalable across new sets of classes. For example, an OCR system designed with an explicit feature selection process for the alphabet of one language usually does not scale to the alphabet of another language.

One approach to avoiding explicit feature selection is to use a (dis)similarity representation instead of a feature vector representation. The training set is represented by a similarity matrix, and new objects are classified based on their similarity to samples in the training set. A suitable similarity measure can also be used to increase the classification efficiency of traditional classifiers such as Support Vector Machines (SVMs).

In this thesis we establish new techniques for sequential pattern recognition without explicit feature extraction for applications where (i) a robust similarity measure exists to distinguish classes, and (ii) the classifier (such as an SVM) utilizes a similarity measure for both training and evaluation. We investigate the use of similarity measures for applications such as on-line signature verification and on-line handwriting recognition. A paucity of training samples can render traditional training methods ineffective, as in the case of on-line signatures, where the number of training samples is rarely greater than 10. We present a new regression measure (ER²) that can classify multi-dimensional sequential patterns without the need for training with a large number of prototypes. We use ER² as a preprocessing filter when sufficient training prototypes are available, in order to speed up SVM evaluation. We demonstrate the efficacy of a two-stage recognition system by using Principal Component Analysis (PCA) and Recursive Feature Elimination (RFE) in the supervised classification framework of SVM. We present experiments with off-line digit images where the pixels are simply ordered in a predetermined manner to simulate sequential patterns. The Generalized Regression Model (GRM) is described to deal with the unsupervised classification (clustering) of sequential patterns.
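The sketch below is not from the thesis; it only illustrates the similarity-representation idea summarized above. A simple R²-style regression fit between fixed-length sequences is used as a hypothetical stand-in for the ER² measure (whose exact definition is not given in this abstract), and an SVM is trained directly on the resulting precomputed similarity matrix rather than on explicit feature vectors.

```python
# Minimal sketch of similarity-based classification without explicit features.
# Assumptions (not from the thesis): sequences are pre-resampled to equal length,
# and a symmetrized R^2 regression fit stands in for the ER^2 measure.
import numpy as np
from sklearn.svm import SVC

def r2_similarity(x, y):
    """Symmetrized coefficient of determination between two 1-D sequences.
    A crude, hypothetical stand-in for the multi-dimensional ER^2 measure."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    def r2(a, b):
        ss_res = np.sum((b - np.polyval(np.polyfit(a, b, 1), a)) ** 2)
        ss_tot = np.sum((b - b.mean()) ** 2)
        return 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return 0.5 * (r2(x, y) + r2(y, x))

def similarity_matrix(seqs_a, seqs_b):
    """Pairwise similarities used in place of explicit feature vectors."""
    return np.array([[r2_similarity(a, b) for b in seqs_b] for a in seqs_a])

# Toy data: two classes of fixed-length sequences (e.g., resampled pen traces).
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
train = [np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal(50) for _ in range(10)] + \
        [np.cos(2 * np.pi * t) + 0.1 * rng.standard_normal(50) for _ in range(10)]
labels = [0] * 10 + [1] * 10

# The training set is represented only by its similarity matrix.
K_train = similarity_matrix(train, train)
clf = SVC(kernel="precomputed").fit(K_train, labels)

# A new sequence is classified from its similarities to the training samples.
test = [np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal(50)]
K_test = similarity_matrix(test, train)
print(clf.predict(K_test))  # should typically print [0]
```

In this setup no feature vector is ever constructed: the classifier sees only pairwise similarities, which is what allows the same pipeline to be reused across different sequence types once a suitable similarity measure is chosen.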
Keywords/Search Tags:Feature, Pattern, Sequential, Classification, Extraction, Training