Person Recognition Based On Audio-Visual Information With Multi-Level Fusion Under Smart Room

Posted on:2015-01-07

Degree:Doctor

Type:Dissertation

Country:China

Candidate:D Wu

GTID:1268330428481228

Subject:Control theory and control engineering

Abstract/Summary:

In recent years, with steadily rising security requirements and the rapid development of remote video conferencing systems, biometric person recognition has become a research focus in pattern recognition. It is widely used in smart video Internet of Things applications, public security, financial services, video conferencing, and many other fields. The accuracy of person recognition based on a single biometric is limited by data noise and by the limitations of the recognition system itself. To address this problem, researchers fuse visual and audio information using information fusion techniques; this approach, audio-visual multi-biometric person recognition, has received intensive attention as a way to improve recognition accuracy. However, current research on audio-visual multi-biometric person recognition is largely confined to recognizing each single biometric under ideal conditions and then simply combining the results with existing fusion methods. Little consideration has been given to the effective extraction of single-biometric features, the construction of highly accurate and general recognition algorithms, or optimal fusion methods. Starting from the human visual and auditory cognitive mechanism, this thesis studies audio-visual multi-biometric person recognition from three aspects: feature extraction, recognition algorithms, and fusion methods.
To provide a workable solution for audio-visual multi-biometric recognition in a smart room, the main work and innovations of this thesis are as follows:

1. Effective extraction of face and voice features in complex environments. First, extracting the most effective DCT coefficients as recognition features is the key step in face feature extraction. From the standpoint of selecting the most effective features, this thesis presents a DCT coefficient selection method based on Discriminant Power Analysis, which extracts the DCT coefficients with the larger discriminant power values. Second, hair features, described by their geometric and color characteristics, are used in face recognition to extend the diversity of face features. Finally, by emulating human hearing, Gammatone Filter Cepstral Coefficients are derived from a Gammatone filter bank model. Because these coefficients reflect only static properties, Gammatone Filter Shifted Delta Cepstral Coefficients are extracted based on Shifted Delta Cepstra.

2. Two face recognition algorithms that can handle the small-sample problem are proposed. To address the low recognition accuracy and poor robustness of face recognition in smart environments, two new algorithms using the kernel trick are proposed: Kernel Relevance Weighted Discriminant Analysis (KRWDA), based on relevance weighted discriminant analysis, and Kernel Discriminant Locality Preserving Projection (KDLPP), based on discriminant locality preserving projection.

3. The Gaussian Mixture Model (GMM) modeling problem in speaker recognition is addressed. The performance of a GMM declines rapidly when the amount of training data is reduced under various unexpected noise environments, so an adaptive Gaussian Mixture Model is proposed. When training data is sufficient, the adaptation of each GMM is expressed as a shift factor based on Factor Analysis. When training data is insufficient, the coordinates of the shift factor are learned from the GMM mixture components that are insensitive to the amount of training data and are then used to compensate the other mixture components. In addition, to improve the performance of the i-vector speaker recognition system in unpredictable noise environments, an improved locality preserving projection algorithm is proposed for reducing the dimensionality of i-vectors.

4. Optimal fusion rules are established at different levels for audio and visual features. Establishing an optimal fusion rule is the main difficulty in fusion-based recognition, and to date there is no universal fusion strategy that suits every practical situation. This thesis establishes optimal matching-level fusion rules and resolves conflicts between bodies of evidence using information entropy, probability density methods, and decision science. Based on an analysis of existing methods for handling conflicting evidence, it proposes an evidence combination rule based on group decision-making and multi-criteria choice fusion, which can effectively resolve evidence conflict. Next, to avoid estimating the match score density directly, the thesis introduces the total error probability into matching-level fusion. At the same time, uncertainty-measure fusion is introduced into multi-feature fusion recognition, and the optimal weights of the weighted sum rule are then obtained from a Gaussian density and applied to rank-level fusion based on logistic regression. Finally, to address match score density fusion, the false acceptance rate (FAR) and false rejection rate (FRR) are used to compute confidence functions, which are then fused using triangular norm (t-norm) operators; this avoids having to calculate the weights of the sum rule.

In summary, this research improves the computer's ability to understand complex information and to process heterogeneous information, further extends the conditions and applications in which multi-biometric fusion recognition can be used, and effectively improves the robustness and recognition rate of multi-feature fusion based on audio and visual features in smart environments. It is of significance for advancing human-computer interaction technology...
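
As an illustration of the matching-level fusion described above, the following is a minimal Python sketch of the plain weighted sum rule applied to face and voice match scores, with FAR and FRR computed at a fixed threshold. It is not the thesis's method: the optimal-weight estimation and the t-norm confidence fusion are not reproduced here. The function names, the face weight 0.7, the threshold 0.5, and the toy scores are all assumptions chosen purely for illustration.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Scale raw match scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def weighted_sum_fusion(face: Sequence[float], voice: Sequence[float],
                        w_face: float = 0.5) -> List[float]:
    """Weighted sum rule: w_face * face + (1 - w_face) * voice, per trial."""
    if len(face) != len(voice):
        raise ValueError("face and voice score lists must have equal length")
    if not 0.0 <= w_face <= 1.0:
        raise ValueError("w_face must lie in [0, 1]")
    return [w_face * f + (1.0 - w_face) * v for f, v in zip(face, voice)]


def far_frr(scores: Sequence[float], is_genuine: Sequence[bool],
            threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when a trial is accepted if score >= threshold."""
    genuine = [s for s, g in zip(scores, is_genuine) if g]
    impostor = [s for s, g in zip(scores, is_genuine) if not g]
    if not genuine or not impostor:
        raise ValueError("need at least one genuine and one impostor trial")
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


if __name__ == "__main__":
    # Hypothetical toy trials: two genuine attempts, then two impostor attempts.
    labels = [True, True, False, False]
    face = min_max_normalize([90, 80, 40, 30])   # assumed raw face scores
    voice = min_max_normalize([7, 2, 6, 1])      # assumed raw voice scores
    fused = weighted_sum_fusion(face, voice, w_face=0.7)

    print("voice only (FAR, FRR):", far_frr(voice, labels, 0.5))
    print("fused      (FAR, FRR):", far_frr(fused, labels, 0.5))
```

On this toy data the voice scores alone misclassify one genuine and one impostor trial at the chosen threshold, while the fused scores separate all four trials. A fixed hand-picked weight is used only to keep the sketch short; choosing that weight well is exactly the problem the abstract says the thesis addresses.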

Keywords/Search Tags:

Smart Environment, Person Recognition, Fusion Criteria, Face Recognition, Speaker Recognition

Related items

1	Multi-speaker Recognition Based On Audio-video Feature Fusion In Smart Environment
2	Studies On Speaker Recognition Based On SVM And GMM
3	Research On Speaker Representation Based On MG Training Criteria
4	Research On Multi-person Speech Recognition Based On Deep Learning
5	Speaker Recognition Based On Multi-information Fusion
6	Research Of Speaker Recognition Technology Based On Fusion Features
7	Research Of Speaker Recognition In Low-SNR Environment
8	Research On The Discrimination Issue In Speaker Recognition
9	Research On Improvement Of Speaker Recognition Algorithms Based On SONAR Platform
10	Speaker Recognition Research In Noisy Environment