Research On Face Annotation In News Images Based On Images And Captions

Posted on: 2019-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: C Zheng
Full Text: PDF
GTID: 2428330566970910
Subject: Information and Communication Engineering
Abstract/Summary:
News images often contain faces, and the names of the people shown often appear in the corresponding captions. The task of face annotation in news images based on images and captions is to learn a face annotation model from a news image-caption dataset and use it to annotate faces in unseen news images. Face annotation is a supporting technology for automatic news content analysis, news aggregation, and related fields, and has broad application prospects.

Currently, face annotation in news images based on images and captions usually includes three key steps: preprocessing, face disambiguation, and face annotation model training. Preprocessing extracts faces and names from the original news image-caption dataset, yielding a weakly labeled face dataset. Face disambiguation finds the correspondence between faces and names in the weakly labeled face dataset, yielding a strongly labeled face dataset. Face annotation model training trains a face annotation model on the strongly labeled dataset.

Although existing research has made great progress, shortcomings remain in all three steps:

(1) Images sometimes contain background faces that have no analysis value. These background faces interfere with face disambiguation, but existing preprocessing does not explore ways of eliminating them.
(2) Existing disambiguation algorithms make insufficient use of the constraints between faces. They usually exploit only the constraint that similar faces have the same name, and neglect the constraint that faces with large differences have different names.
(3) During face annotation model training, existing methods usually use only face similarity information and do not make adequate use of other multi-modal information that is useful for face annotation.

This thesis focuses on the above problems and makes the following contributions:

(1) To address background faces, a background face deletion algorithm based on robust principal component analysis is proposed, which detects and deletes background faces in the preprocessing step. Based on the hypothesis that background faces are outliers in a news face dataset, background faces are detected by measuring the outlierness scores of faces. Specifically, the training face dataset is first sampled to obtain multiple training subsets. Next, using the robust principal component analysis algorithm, each training subset is used to train a background face base detector that measures the outlierness scores of faces independently, and the final outlierness score of each face is set to the sum of the outputs of all base detectors. Faces with high outlierness scores are then identified as background faces and deleted. Experimental results on public news face datasets show that the proposed algorithm detects background faces better than benchmark detection algorithms and filters background faces more effectively.

(2) To address the insufficient use of face constraints in existing face disambiguation algorithms, a face disambiguation algorithm based on pairwise constraints is proposed. The algorithm exploits two face constraints: that similar faces have the same name, and that faces with large differences have different names. First, the influence of the widespread data-imbalance problem in news face datasets on the low-rank representation algorithm is studied, the way in which low-rank representation coefficients represent face similarity is analysed, and pairwise constraints between faces are extracted according to face similarity. Then graph models based on the pairwise constraints are established, and face disambiguation is completed by minimizing the energy function defined on the graph models. Experimental results on public news face datasets demonstrate that the proposed method achieves better face disambiguation accuracy than benchmark face disambiguation algorithms.

(3) To address the problem that existing face annotation model training algorithms usually use only face similarity information and neglect other multi-modal information that is useful for face annotation, a face annotation algorithm based on multi-modal information fusion is proposed. First, the algorithm extracts multi-modal information, including the face-name matching degree based on face similarity, face size, face position, face definition, and name position. Second, a face annotation model based on multi-modal information fusion is trained on the face disambiguation results of the face training dataset. Experimental results on a public news face dataset show that multi-modal information helps improve the face annotation model, and that the model based on multi-modal information fusion achieves better annotation accuracy than annotation models that use only face similarity.
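
The background face detection step described in the abstract (sample training subsets, fit one robust-PCA base detector per subset, sum the detector outputs) can be illustrated with a small sketch. This is not the thesis's implementation. It assumes each face is already a fixed-length feature vector (one row of `x`), uses a standard principal component pursuit solver for robust PCA, and scores a face by its distance from the low-rank subspace recovered on each subset. The function names, the `rank`, `sample_frac`, and `frac` parameters, and the residual-based score are all illustrative assumptions.

```python
import numpy as np


def soft_threshold(x, tau):
    """Shrink every entry of x toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def rpca(m, n_iter=200, tol=1e-7):
    """Split m into low-rank + sparse parts (principal component pursuit via ADMM)."""
    m = np.asarray(m, dtype=float)
    lam = 1.0 / np.sqrt(max(m.shape))
    mu = m.size / (4.0 * np.abs(m).sum() + 1e-12)
    norm_m = np.linalg.norm(m) + 1e-12
    low_rank = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    for _ in range(n_iter):
        # Low-rank update: singular value thresholding.
        u, sig, vt = np.linalg.svd(m - sparse + dual / mu, full_matrices=False)
        low_rank = (u * soft_threshold(sig, 1.0 / mu)) @ vt
        # Sparse update: entrywise shrinkage.
        sparse = soft_threshold(m - low_rank + dual / mu, lam / mu)
        residual = m - low_rank - sparse
        dual = dual + mu * residual
        if np.linalg.norm(residual) <= tol * norm_m:
            break
    return low_rank, sparse


def fit_base_detector(x_sub, rank):
    """Assumed base detector: the top-`rank` subspace of the robust low-rank part."""
    mean = x_sub.mean(axis=0)
    low_rank, _ = rpca(x_sub - mean)
    _, _, vt = np.linalg.svd(low_rank, full_matrices=False)
    return mean, vt[:rank]


def outlier_scores(x, mean, basis):
    """Assumed score: distance of each face vector from the detector's subspace."""
    centered = x - mean
    residual = centered - (centered @ basis.T) @ basis
    return np.linalg.norm(residual, axis=1)


def ensemble_outlier_scores(x, n_detectors=10, sample_frac=0.5, rank=5, seed=0):
    """Sum the scores of base detectors trained on random subsets of the faces."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    size = min(n, max(rank + 1, int(sample_frac * n)))
    total = np.zeros(n)
    for _ in range(n_detectors):
        idx = rng.choice(n, size=size, replace=False)
        mean, basis = fit_base_detector(x[idx], rank)
        total += outlier_scores(x, mean, basis)
    return total


def flag_background(scores, frac=0.05):
    """Mark the highest-scoring fraction of faces as background (hypothetical cutoff)."""
    return scores > np.quantile(scores, 1.0 - frac)
```

Under these assumptions, `ensemble_outlier_scores(x)` returns one summed score per face, and `flag_background` selects the faces to delete. The abstract does not say how the cutoff for a "high" score is chosen, so the quantile rule here is only a placeholder.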
Keywords/Search Tags:News Images, Face Annotation, Robust Principal Component Analysis, Pairwise Constraints, Multi-modal Information