From Human Body To Face

Posted on: 2018-05-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Jin
Full Text: PDF
GTID: 1368330590466660
Subject: Computer Science and Technology
Abstract/Summary:
Many objects studied in computer vision can be decomposed into a set of anatomical parts. For example, the human body can be represented as a combination of head, torso and limbs, while a face can be represented by eyes, nose, mouth, chin, etc. Detecting the parts of an object is a fundamental task in computer vision. Based on part detection, we can: 1) perform size and pose normalization for the objects of interest; 2) extract regions of interest; 3) extract part-based local features; 4) perform part-based high-level inference, etc. Besides anatomical parts, we can also detect whether a given image contains certain specially defined parts of an object in order to verify a property of the image. For example, we can recognize pornographic images by verifying whether the image contains pornographic human body parts.

Generally, the parts of an object are subject to certain spatial relationships, which are very useful when detecting these parts simultaneously. However, the strength of the spatial relationship differs between objects. For example, the parts of the human body are rather flexible (an arm can be either above or below the shoulder), while the relationship between facial parts is much more stationary (the eyes are always above the nose). In addition, if the goal is to detect a single kind of part (e.g., pornographic parts) rather than multiple kinds of parts, part detection degenerates into the general object detection problem, which usually does not require considering the spatial relationship between parts.

Current work typically treats different part detection tasks as individual research topics. For example, human body part detection (i.e., pose estimation) and facial part detection (i.e., face alignment) are usually considered completely different tasks. To the best of our knowledge, there is currently no work that studies and analyzes different part detection tasks side by side, revealing the differences and relationships between them. To fill this gap, following the order of increasing strength of spatial relationship, from human body to face, we choose three concrete part detection tasks: pornographic body part detection (no spatial relationship), human body part detection (flexible spatial relationship), and facial part detection (stationary spatial relationship). We propose novel algorithms for each of these tasks, and discuss the differences between the modeling principles of part detection tasks with different strengths of spatial relationship, along with several other important issues. Specifically, the main contributions of this paper can be summarized as follows:

1) We study the problem of pornographic image recognition using part detection techniques. We first define pornographic parts, including key pornographic parts and targeted pornographic parts (the targets for detection). We then define a measure of the degree of pornography for any region in an image, aiming to address the subjectivity and ambiguity involved in the definition of targeted pornographic parts. Finally, we apply deep multiple instance learning to explicitly incorporate the pornographic degree of different image regions into the training process of the pornographic part detector. To evaluate our algorithm, we collect a large-scale dataset consisting of 138,000 pornographic images and 205,000 normal images. Our algorithm produces excellent results on a test set of 100,000 pornographic images and 100,000 normal images, achieving a 97.53% True Positive Rate at a 1% False Positive Rate.

2) We propose a novel pose expert algorithm for articulated human pose estimation. We propose to group samples in the pose space before articulation learning, with each group consisting of samples with similar poses or semantic meaning (e.g., actions). We then train one pose estimator (called a pose expert in this work) for each group, and at test time we simply select the output of the pose expert with the highest confidence. Since we impose the global bias of a specific pose during training, each pose expert can better handle examples of that pose. We propose two pose grouping methods and evaluate their effectiveness on two benchmarks, obtaining a remarkable improvement over the pose estimator trained with all data. Furthermore, we propose a robust action recognition method based on our pose experts, which achieves good performance on standard benchmarks.

3) We propose a novel, robust, discriminative Hough voting method for face alignment. We first unify the PCA-based point distribution model and the shape exemplar-based model in a probabilistic Constrained Local Model (CLM) framework, and then propose our Hough voting-based method by extending the shape exemplar model. Compared to the baseline, our method uses far fewer anchor points and is very robust to inaccurate anchor points. In addition, we propose a discriminative method to select good exemplars that fit the given face well. Our face alignment method achieves promising performance on four challenging face datasets.

4) Based on the above three part detection tasks, we summarize how different strengths of spatial relationship affect the modeling principles, and present a series of discussions about the shape model, which lies at the core of a part detection algorithm. These discussions cover the motivation, advantages and limitations of the recently popular implicit shape model, the data distribution assumptions behind the classic tree-structured model and the PCA-based shape model, and possible methods for changing the shape model's flexibility. These discussions may lead to a better understanding of the rationale and limitations of the current best pose estimation and face alignment methods, and help to inspire advanced algorithms for part detection tasks.
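The abstract reports detector quality as a True Positive Rate at a fixed 1% False Positive Rate. As a minimal sketch of how such an operating point can be computed from per-image detector scores, the following may help; the function name `tpr_at_fpr`, the score lists, and the strict `>` decision rule are all assumptions made for illustration, not the evaluation code used in the dissertation.

```python
def tpr_at_fpr(pos_scores, neg_scores, target_fpr=0.01):
    """Return (threshold, tpr, fpr) at the lowest threshold whose FPR
    does not exceed target_fpr.

    Assumption: an image is predicted positive when its score is
    strictly greater than the threshold.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")

    neg_sorted = sorted(neg_scores, reverse=True)
    # Number of negatives that may be misclassified within the FPR budget.
    allowed = int(target_fpr * len(neg_sorted))
    # Thresholding at the (allowed + 1)-th highest negative score means
    # at most `allowed` negatives score strictly above the threshold.
    if allowed < len(neg_sorted):
        threshold = neg_sorted[allowed]
    else:
        threshold = float("-inf")

    false_pos = sum(s > threshold for s in neg_scores)
    true_pos = sum(s > threshold for s in pos_scores)
    return threshold, true_pos / len(pos_scores), false_pos / len(neg_scores)


# Hypothetical scores: 100 normal images and 100 positive images.
normal = list(range(100))        # scores 0..99
positive = list(range(50, 150))  # scores 50..149
threshold, tpr, fpr = tpr_at_fpr(positive, normal, target_fpr=0.01)
```

Fixing the false positive rate first and then reading off the true positive rate makes results comparable across detectors whose raw scores are on different scales.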
Keywords/Search Tags:part detection, pornographic images, pose estimation, face alignment, multiple instance learning, deep learning, spatial constraint, shape model, tree structured model