Research And Implementation On Foreground And Background Separation Of Person Images

Posted on: 2021-04-04 | Degree: Master | Type: Thesis
Country: China | Candidate: X F Li
GTID: 2428330623968160 | Subject: Software engineering
Abstract/Summary:
Semantic segmentation is an important research topic in computer vision. This thesis applies semantic segmentation to the separation of person foreground from background, focusing on extracting the foreground person from rich and varied backgrounds. The main applications are background replacement for TaoBao clothing-model photos and background editing (background removal, background blur, and background replacement) in self-portrait machines in shopping malls. FCN, the pioneering network for semantic segmentation, has few parameters and a simple network structure, so this thesis builds its improvements on it.

First, experiments based on FCN show that the results contain local classification errors and aliasing effects. The causes are analyzed theoretically from the experimental results, and the network structure is redesigned accordingly. Starting from the FCN structure, the horizontal and vertical connection methods are each redesigned, yielding a new network structure named FCN'. The experimental results after this improvement show that the new structure generates clear and continuous foreground person masks.

With local classification errors and aliasing effects resolved, an analysis of the icome dataset and the LIP dataset shows that some person images still suffer from spatial inconsistency. Such images account for about 30% and 15% of the two datasets, respectively. To address this problem, a point-based spatial attention module is designed by analyzing the Non Local operation and self-attention, and the unidirectional information flow between the original pixels is changed to a bidirectional one. Compared with PSP networks, which currently achieve relatively high segmentation accuracy on person images, the objective accuracy indicators mIoU and pixAcc both improve by 3%. Subjectively, the spatial inconsistency is also resolved, which makes the edges more natural and clear. Compared with generative adversarial networks, the results show no noise blocks.

The attention mechanism increases the complexity of the network, so the method offers little advantage over existing PSP networks in space and time complexity. To speed up computation, an attention cross-grouping calculation method is proposed, based on an analysis of structured attention. The original dense attention matrix is decomposed into the product of two sparse matrices while still taking all pixels into account, which improves accuracy by about 1%. For the small feature maps in this task, time and space complexity are reduced by about 18%; for large feature maps, the reduction is more pronounced. Comparative experiments with different network structures show that this method improves accuracy and performance on multiple datasets. This thesis has also played a guiding role in actual projects.
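The factorization of a dense attention matrix into two sparse ones can be illustrated with a small NumPy sketch. This is not the thesis's implementation: the grouping scheme (contiguous blocks followed by strided groups), the function names, and the parameters `p` and `q` are all assumptions chosen for illustration. The sketch treats a feature map as `N = p * q` flattened positions, attends within `p` contiguous blocks first, then within `q` strided groups, and returns the product of the two sparse attention matrices so that its density can be inspected.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; entries set to -inf receive weight 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def group_masks(p, q):
    # Position i = a * q + b, with block index a and in-block offset b.
    # (Hypothetical grouping, assumed for this sketch.)
    idx = np.arange(p * q)
    a, b = idx // q, idx % q
    local = a[:, None] == a[None, :]      # same contiguous block: q nonzeros per row
    strided = b[:, None] == b[None, :]    # same offset across blocks: p nonzeros per row
    return local, strided

def masked_attention(x, mask):
    # Scaled dot-product attention restricted to the positions allowed by `mask`.
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores = np.where(mask, scores, -np.inf)
    return softmax(scores, axis=-1)

def cross_group_attention(x, p, q):
    # x: (p * q, channels) array of flattened pixel features.
    local, strided = group_masks(p, q)
    a1 = masked_attention(x, local)       # sparse matrix 1: within-block attention
    y = a1 @ x
    a2 = masked_attention(y, strided)     # sparse matrix 2: across-block attention
    # a2 @ a1 is the effective attention matrix: dense, although both factors are sparse.
    # Dense boolean masks are used here for clarity, not for speed.
    return a2 @ y, a2 @ a1
```

With `p = 3` and `q = 4`, the two masks together hold 84 nonzero entries, against 144 for a full 12 x 12 attention matrix, yet every entry of the returned product is positive, so each output position still receives information from every input position. A practical version would reshape the features into groups instead of building full masks; the savings reported in the abstract depend on the thesis's actual grouping, which this sketch does not reproduce.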
Keywords/Search Tags: semantic segmentation, fully convolutional network (FCN), spatial inconsistency, attention mechanism