
Multi-modal Person Identification And Application In Humanoid Service Robot

Posted on: 2021-01-15; Degree: Master; Type: Thesis
Country: China; Candidate: J J Ye; Full Text: PDF
GTID: 2518306470461664; Subject: Mechanical engineering
Abstract/Summary:
Person identification technology is now part of daily life and has become an active research direction in human-computer interaction and artificial intelligence. Large-scale identification under unconstrained conditions remains a challenging problem, and combining multi-modal learning with deep learning is an effective way to address it. The key is to build a sound framework that fuses information from different modalities and improves the classification ability of the algorithm. Such a fusion framework can operate at the decision layer or at the feature layer. It is also necessary to consider how to extract effective features from each modality efficiently.

Traditional person identification methods usually rely on a single modality, such as face, speech or fingerprint features. When these methods are applied in real unconstrained environments, their recognition accuracy is not high enough. A survey of the domestic and international literature and its experimental results shows that single-modal identification algorithms have significant limitations when applied across different scenes, and that multi-modal identification algorithms outperform single-modal ones in those scenes. Based on a multi-modal information fusion strategy, the proposed person identification algorithm combines face, head and voice information within one fusion framework to improve accuracy and robustness.

The research contents of this thesis are as follows:

1. Dataset processing. The quality of the dataset directly affects the final result. Data cleaning is performed according to an analysis of the original dataset: an evaluation model scores the data, and cleaning rules are formulated to filter it. Samples carrying little feature information are excluded and only samples with more feature information are retained, which reduces noise in the dataset and improves model training.

2. Feature extraction. Effective features must be extracted from face, head and voice information. Different face and head detection algorithms are selected to obtain face and head images. The voice data is converted into a single-channel, 16-bit stream sampled at 44100 Hz, and its spectrum is obtained through a fast Fourier transform (FFT) without normalization. Finally, features are extracted with a ResNet neural network.

3. Construction of the fusion algorithm. A large-scale multi-modal person identification algorithm for real unconstrained environments is proposed, and fusion at the feature layer is compared with fusion at the decision layer. A quality score is obtained from the face quality evaluation model and a confidence score from the human detection model. The two scores are used to calculate a weight, and the weight is used to construct a fusion strategy that integrates the multi-modal features, improving the robustness and accuracy of the algorithm.

4. Large-scale classification. To handle large-scale classification, a solution applicable to person identification is proposed that partitions the data into intervals by face quality score. Multiple classifiers are trained on the resulting sub-datasets, and their outputs are fused by a statistical method.

The proposed multi-modal information fusion strategy is tested and compared with single-modal methods and with other multi-modal fusion methods. An open-source dataset is used for testing, with Mean Average Precision (MAP) as the evaluation metric. The obtained accuracy is 92.17%, which is 2.47% higher than the state-of-the-art multi-modal identification method. Finally, the analysis and comparison of each group of experiments show that the proposed fusion strategy using multi-modal information is effective for large-scale person identification in real unconstrained environments.
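
The abstract does not give the weighting formula, so the following is only a minimal sketch of how a face quality score and a detection confidence score could drive a decision-layer fusion. The function names, the fixed `voice_weight`, and the rule of normalizing the two scores are all illustrative assumptions, not the thesis's method.

```python
# Illustrative sketch only: the weighting rule below is an assumption,
# not the formula used in the thesis.

def fusion_weights(face_quality, head_confidence, voice_weight=0.2):
    """Map two scores in [0, 1] to (face, head, voice) weights summing to 1.

    Assumed rule: the voice modality keeps a fixed share, and the remaining
    share is split between face and head in proportion to their scores.
    """
    total = face_quality + head_confidence
    if total == 0:
        # No usable visual evidence: rely on voice alone.
        return 0.0, 0.0, 1.0
    visual_share = 1.0 - voice_weight
    w_face = visual_share * face_quality / total
    w_head = visual_share * head_confidence / total
    return w_face, w_head, voice_weight


def fuse_scores(face_probs, head_probs, voice_probs, weights):
    """Decision-layer fusion: weighted sum of per-class scores."""
    w_face, w_head, w_voice = weights
    return [
        w_face * f + w_head * h + w_voice * v
        for f, h, v in zip(face_probs, head_probs, voice_probs)
    ]


def predict(fused):
    """Return the index of the highest fused score."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical per-class scores for three identities.
weights = fusion_weights(face_quality=0.9, head_confidence=0.3)
fused = fuse_scores([0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.4, 0.4, 0.2], weights)
best = predict(fused)
```

With a sharp face image (high quality score) the face classifier dominates the decision; as the face score drops, the head classifier takes over a larger share, which is the intuition behind weighting modalities by per-sample quality.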
Keywords/Search Tags:Multi-modal, Information Fusion, Person identification, Deep learning