Multi-modal Person Identification And Application In Humanoid Service Robot

Posted on:2021-01-15

Degree:Master

Type:Thesis

Country:China

Candidate:J J Ye

Full Text:PDF

GTID:2518306470461664

Subject:Mechanical engineering

Abstract/Summary:

Person identification technology is now part of daily life and has become an active research direction in human-computer interaction and artificial intelligence. Large-scale identification under unconstrained conditions remains challenging, and combining multi-modal learning with deep learning is an effective way to address it. The key is to build a sound framework that fuses information from different modalities and improves the classification ability of the algorithm; such a fusion framework can operate at the decision layer or the feature layer. It is also necessary to consider how to extract effective features from each modality efficiently.

Traditional methods usually identify a person from a single modality, such as face, speech or fingerprint features. When applied in real unconstrained environments, their recognition accuracy is not high enough. A review and comparison of a large number of domestic and international publications and experimental results shows that single-modal identification algorithms have significant limitations across different scenes, and that multi-modal algorithms outperform single-modal ones in those scenes. Based on a multi-modal information fusion strategy, the proposed person identification algorithm combines face, head and voice information within a fusion framework to improve accuracy and robustness.

The research contents of this thesis are as follows:

1. Dataset processing. The quality of the dataset directly affects the final result. Data cleaning is performed according to an analysis of the original dataset: an evaluation model scores the data, and cleaning rules are formulated to filter it. Samples with little feature information are excluded and only samples with more feature information are retained, which reduces noise in the dataset and improves model training.

2. Feature extraction. Effective features must be extracted from face, head and voice information. Different face and head detection algorithms are selected to obtain face and head images. The voice signal is converted into a single-channel 16-bit data stream at a sampling frequency of 44100 Hz, and its spectrum is obtained through a fast Fourier transform (FFT) without normalization. Finally, features are extracted by a ResNet neural network.

3. Construction of fusion algorithms. An algorithm for large-scale multi-modal person identification in real unconstrained environments is proposed, and fusion at the feature layer and at the decision layer are compared. A quality score is obtained from the face quality evaluation model and a confidence score from the human detection model. The two scores are used to calculate a weight, and the weight is used to construct a fusion strategy that integrates multi-modal features, improving the robustness and accuracy of the algorithm.

4. Large-scale classification. A solution applicable to person identification is proposed that divides the data into intervals by face quality score. Multiple classifiers are trained on the resulting sub-datasets, and their outputs are fused by a statistical method.

The proposed multi-modal information fusion strategy is tested and compared with single-modal and other multi-modal fusion methods. An open-source dataset is used for testing, with mean average precision (mAP) as the evaluation metric. The obtained accuracy is 92.17%, which is 2.47% higher than the state-of-the-art multi-modal identification method. Finally, analysis and comparison of each group of experiments show that the proposed fusion strategy is effective for large-scale person identification in real unconstrained environments.
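
The voice preprocessing described in item 2 (single channel, 16-bit, 44100 Hz, FFT without normalization) could look roughly like the NumPy sketch below. It is an illustration only, not the thesis implementation: the function names, the frame length, the hop size and the Hann window are all assumptions, since the abstract does not state how the signal is framed before the FFT.

```python
import numpy as np

SAMPLE_RATE = 44100  # Hz, the sampling frequency stated in the abstract


def to_mono_int16(samples):
    """Collapse (n_samples, n_channels) audio to one 16-bit channel.

    Hypothetical helper: channel averaging is an assumed downmix rule.
    """
    samples = np.asarray(samples)
    if samples.ndim == 2:
        # Average in a wider integer type so the sum cannot overflow int16.
        samples = samples.astype(np.int32).mean(axis=1)
    return np.clip(np.round(samples), -32768, 32767).astype(np.int16)


def magnitude_spectrogram(mono, frame_len=1024, hop=512):
    """Per-frame FFT magnitudes with no amplitude normalization.

    frame_len, hop and the Hann window are assumed values.
    Returns an array of shape (n_frames, frame_len // 2 + 1).
    """
    signal = np.asarray(mono, dtype=np.float64)
    if len(signal) < frame_len:
        # Zero-pad very short clips up to a single full frame.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(frames, axis=1))
```

A spectrogram produced this way is a 2-D array, so it can be passed to an image-style network such as ResNet in the same way as the face and head crops. Row `k` of each frame corresponds to the frequency `k * SAMPLE_RATE / frame_len`.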

Keywords/Search Tags:

Multi-modal, Information Fusion, Person identification, Deep learning

Related items

1	Person Re-Identification Based On Multi-cue Information Fusion
2	Research On Person Re-identification Based On Learning Multi-modal Feature Representations
3	Design And Implementation Of Person Re-Identification System Based On Deep Learning
4	Algorithms Research For Single-modal And Cross-modal Person Re-identification
5	Cross-modal Person Re-identification Based On Deep Learning
6	Research On Person Re-identification Algorithm Based On Deep Learning
7	Research On Learning Adaptive Ranking Functions And Deep Features For Person Search
8	Multi-modal Person Re-Identification Based On Fine-grained Feature Fusion
9	Research On Person Re-Identification Based On Cross-Modal Data
10	Research On Person Re-identification Technology In Non-overlapping Views