
Research On Text Person Re-identification Based On Convolutional Neural Network

Posted on: 2022-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: M M Yang
Full Text: PDF
GTID: 2518306539453234
Subject: Software engineering
Abstract/Summary:
Recently, with the growing demand for public security, video surveillance has been widely deployed, generating a huge amount of video monitoring data. However, it is difficult to manually search for criminal suspects in such large-scale video data. Therefore, text-based person re-identification has great application value in video surveillance, telephone alarms, suspect search, and other fields. Text-based person re-identification must overcome modal heterogeneity, that is, the large difference between text and image information. As convolutional neural networks have matured, this research direction has developed rapidly. This thesis takes convolutional neural networks as its core and addresses the following two research topics:

(1) Aggregated Squeeze-and-Excitation transformations for Densely Connected Convolutional Networks. To address the problems of densely connected convolutional neural networks, this thesis constructs an efficient lightweight convolutional neural network, Densely connected and InterSparse convolutional Networks with aggregated Squeeze-and-Excitation transformations (DenisNet-SE). The network adopts both dense connections and grouped convolution, which strengthens feature reuse, increases the cardinality of transformations, and reduces the model size. By further introducing the Squeeze-and-Excitation (SE) block and the Squeeze-Excitation-Residual (SERE) block, a channel-level attention mechanism is constructed for feature selection to improve the performance of the network. Experimental results on three image classification benchmark datasets (CIFAR-10, CIFAR-100, ImageNet) all show the good performance of the lightweight network.

(2) Dual-path CNN with Max Gated block for Text-Based Person Re-identification. To solve the problem of modal heterogeneity in text-based person re-identification, this thesis constructs a Dual-path CNN with Max Gated block (DCMG). The proposed framework is based on two deep residual CNNs jointly optimized with a cross-modal projection matching loss and a cross-modal projection classification loss to embed the two modalities into a joint feature space. The pre-trained language model BERT and the residual convolutional neural network are combined to obtain discriminative word embeddings. The Global Max Pooling (GMP) layer makes the visual-textual features focus more on the salient parts. A gated block (GB) is further proposed to produce an attention map that suppresses the noise of the max-pooled features. Finally, extensive experiments on the benchmark dataset (CUHK-PEDES) show that the proposed method outperforms the state-of-the-art method. The method is also evaluated on two generic retrieval datasets (Flickr30K, Oxford-102 Flowers) and obtains competitive performance.
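
As an illustration of the channel-level attention mentioned in part (1), the following is a minimal sketch of a generic Squeeze-and-Excitation step in plain Python. It is not the thesis's implementation: the function name `se_block`, the weight matrices `w1` and `w2`, the omission of biases, and the toy input are all assumptions made for this example. It only shows the general idea of squeezing each channel to one value, computing a gate per channel, and rescaling the channels.

```python
import math


def se_block(feature_map, w1, w2):
    """Reweight the channels of feature_map (C x H x W nested lists).

    w1: (C/r) x C weights of the reduction layer (hypothetical, bias omitted)
    w2: C x (C/r) weights of the expansion layer (hypothetical, bias omitted)
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck FC -> ReLU -> FC -> sigmoid yields gates in (0, 1).
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: 2 channels of size 2x2 (values chosen only for illustration).
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
w1 = [[1.0, 1.0]]        # reduce 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]     # expand 1 hidden unit -> 2 channel gates
scaled, gates = se_block(x, w1, w2)
print(gates)
```

In this toy setup the first channel receives a gate close to 1 and the second a gate close to 0, which is the feature-selection behaviour the abstract attributes to channel-level attention. In a trained network the weights would be learned rather than fixed by hand.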
Keywords/Search Tags: Convolutional Neural Network, Attention Mechanism, Person Re-identification, Cross-modal Matching