Font Size: a A A

Research On Coding Algorithms For Visual Analysis Tasks

Posted on:2022-09-02Degree:MasterType:Thesis
Country:ChinaCandidate:Q WangFull Text:PDF
GTID:2518306722952009Subject:Communication and Information System
Abstract/Summary:PDF Full Text Request
Traditional image and video coding algorithms can effectively reduce the bit-rate of images and videos,while maintaining the visual viewing experience of the encoded images and videos.However,with the advancement of surveillance systems and visual analysis applications,more and more visual data will be directly analyzed by visual analysis tasks after being transmitted,thus there is no need for human viewing.Current traditional coding algorithms mainly focus on the human viewing experience of the encoded images and videos,while ignoring their analysis performance in visual analysis tasks.Therefore,in order to effectively support human viewing and machine analysis,this paper explores the image and video coding algorithms for visual analysis tasks.For image coding methods,this paper proposes a traditional image coding algorithms and an end-to-end image coding algorithms for visual analysis tasks.For video coding methods,this paper proposes a scalable video coding algorithm for pedestrian detection tasks.The main research contents and contributions are as follows:1.A HEVC(High Efficiency Video Coding)intra coding algorithm that is oriented to recognition task is proposed to improve the recognition performance of the encoded images.The traditional image coding algorithm BPG(Better Portable Graphics)based on HEVC intra coding algorithm can achieve efficient image coding,and it exceeds the image coding formats JPEG(Joint Photographic Experts Group)and JPEG2000 in terms of coding efficiency and image quality.However,HEVC intra coding algorithm does not consider the influence of the reconstruction quality of different regions on recognition tasks,which leads to the degradation of recognition performance after encoding at low bit-rate.Therefore,our algorithm combines the visualization information of the recognition network and the deep neural network feature information to explore the regions that are more important for the recognition tasks.By adjusting the bit allocation strategy of HEVC intra coding,the regions will be allocated more bits and maintain better reconstruction quality.Experiment results demonstrate that our algorithm achieves 19.12% bit-rate saving while maintaining the same recognition accuracy,and it also has similar visual quality compared to the existing coding algorithms.2.An end-to-end image coding algorithm based on semantic prior attention module is proposed to improve the visual analysis performance of the encoded images in different visual analysis tasks.The current image coding algorithms treat different regions and different deep neural network features equally,and do not take machine perception quality into account,which seriously affects the visual analysis performance of the encoded images.Therefore,our algorithm combines semantic prior information with attention module to focus the semantic-related content and enhance the semantic-related deep neural network features.Besides,a cascaded coding framework is constructed to guarantee the machine perception quality of the semantic-related content.Compared with the advanced image coding algorithms,our algorithm achieves average 7% recognition accuracy improvement.For various visual analysis tasks,our algorithm has greater analysis performance,which verifies the generality of our algorithm.3.A hierarchical video coding algorithm for pedestrian detection tasks is proposed to improve the pedestrian detection accuracy of the encoded videos.Considering the irregularity of pedestrian movement on urban roads,it is necessary to imply pedestrian detection on urban roads.Considering the coding algorithms for urban surveillance videos do not consider the structure information and the diversity of categories,our algorithm introduces semantic segmentation technology to extract the category information and the structure information in the scenes at the encoding,and the generative adversarial network generates the pseudo video frames with the consideration of the structure information at the decoding.Besides,the pedestrian regions are extracted and encoded as the bit stream of enhancement layer to ensure the fidelity of the texture and color.Experimental results show that our algorithm can not only improve the pedestrian detection accuracy of the encoded videos,but also maintain the visual quality of the pedestrian regions.
Keywords/Search Tags:HEVC intra coding, image coding, visual analysis task, image recognition, pedestrian detection
PDF Full Text Request
Related items