Research On Content Analysis And Processing Technology Based On Stereoscopic Vision

Posted on: 2018-08-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W J Geng
GTID: 1318330512999392
Subject: Computer Science and Technology
Abstract/Summary:
VR, AR and IMAX 3D have become hot topics in recent years. With the widespread availability of acquisition equipment and the explosive growth of stereoscopic media, a rapidly growing number of people have the opportunity to encounter, use and study stereoscopic media. Although stereoscopic media take many forms, this thesis focuses on content analysis and processing technology for binocular media, which is closest to human stereo vision. Compared with traditional multimedia information processing, the key to stereo media processing lies in mining and exploiting the relationship between different views. The extra information preserved in binocular media brings more cues but also more noise to subsequent processing. It is therefore imperative to explore new solutions for stereoscopic media that improve both the efficiency and the quality of media processing. Based on a survey of domestic and international research on content analysis and processing of stereoscopic media, the thesis analyses the existing problems in several key technologies and offers corresponding solutions. The novelty and major contributions of this thesis are summarized as follows.

1. A novel bidirectional motion-based interpolation method for binocular depth estimation, which exploits the redundancy between frames and adaptive motion estimation to improve computing efficiency and keep depth sequences consistent. Most existing depth estimation methods start from stereo matching, which can achieve good results when the exact depth range is set. However, conducting stereo matching frame by frame may lead to temporally inconsistent depth sequences. To preserve temporal consistency, some methods resort to global optimization, which inevitably adds computing cost. Considering the characteristics of binocular videos, a fast bidirectional motion-based interpolation method is proposed that leverages coarse-to-fine depth calculation. Experiments show that the proposed method is capable of fast and accurate depth acquisition for binocular video.

2. A novel object proposal method for proposing multiple objects in videos, which suppresses proposal inconsistency and improves computing efficiency by constructing a context-aware object proposal model. Most previous work focuses on image object proposals, and existing explorations of video object proposals mainly start from frame-by-frame object proposals and concentrate on localizing moving or dominant objects. Experiments show that frame-by-frame object proposals may lead to computing redundancy and proposal inconsistency. To tackle these problems, an adaptive context-aware model for video object proposals is proposed. Its adaptive window mapping strategy leverages both spatial and temporal proposals, providing a general solution for applying image proposals to videos. Furthermore, a multi-object dataset with an average of 3.34 objects per frame is built to promote further research on video object proposals.

3. A novel multiple salient object detection method based on view fusion, which further improves the mean average precision by exploiting the detection inconsistency that occurs between different views. Salient object detection generally assumes that there is only one salient object in the scene, and multiple salient object detection has not been sufficiently explored, especially in stereoscopic images. Experiments find that conducting salient object detection separately on different views may cause detection inconsistency. To solve this problem, a view-fusion-based multiple salient object detection method is proposed. By exploring the relationship between candidate salient windows generated from different views, a dual-probabilistic estimation strategy that considers both the probability of saliency and the probability of being an object is introduced to refine the output scores of candidate windows, which yields superior precision-recall performance in multiple salient object detection.

4. A novel glasses-free 3D demonstration, which serves the widespread stereoscopic images and provides a specific idea for autostereoscopy. Expensive and cumbersome 3D equipment currently limits the popularization of emerging stereo media on the Internet. Existing wiggle stereoscopy lacks detailed analysis and modeling of the human visual system (HVS), so it flickers heavily and delivers uncomfortable 3D perception. To address this, a Flat3D method that animates stereoscopy using only a conventional screen is proposed, based on the HVS, motion parallax and visual persistence. Experiments show that the proposed method is a more convenient, effective and automatic alternative for browsing stereo images on common flat screens.

5. A novel stereo video refocusing method based on a computational cinematographic model, which generates DSLR-like refocusing results. At present, stereo videos mainly serve cinemas and VR/AR equipment and rarely appear in people's daily life. In fact, the depth information preserved in stereoscopic videos enables rich processing of the video content. To eliminate the artificial-looking results generated by existing software-oriented refocusing methods, a user-oriented stereo video refocusing method based on a computational cinematographic model is proposed. It creates digital single-lens reflex (DSLR) effects by considering the photographic concepts of focus plane, depth of field (DoF) and circle of confusion (CoC).

In addition, a series of prospective directions are discussed based on the research in this thesis, demonstrating that the research topic is systematic and extensible. The research findings play a fundamental role and have bright prospects in the field of content analysis and processing technology based on stereoscopic vision.
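As an illustration of how the depth cues in a stereo pair relate to the circle of confusion mentioned in the fifth contribution, the following is a minimal sketch using the textbook pinhole-stereo and thin-lens formulas. It is not the thesis's cinematographic model, which the abstract does not specify; the function names, parameter names and sample values are all assumptions chosen for illustration.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_mm):
    """Depth of a point seen by a rectified stereo pair: Z = f * B / d.

    Assumes a rectified pinhole stereo rig (an assumption, not stated
    in the abstract). Returns depth in millimetres.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_mm / disparity_px


def circle_of_confusion(focal_length_mm, f_number, focus_dist_mm, object_dist_mm):
    """Thin-lens circle-of-confusion diameter on the sensor, in millimetres.

    c = A * m * |S2 - S1| / S2, where A = f / N is the aperture diameter,
    m = f / (S1 - f) is the magnification at the focus distance S1,
    and S2 is the distance of the object being imaged.
    """
    aperture_mm = focal_length_mm / f_number
    magnification = focal_length_mm / (focus_dist_mm - focal_length_mm)
    defocus = abs(object_dist_mm - focus_dist_mm) / object_dist_mm
    return aperture_mm * magnification * defocus


# Hypothetical example values: a 50 mm f/2 virtual lens focused at 2 m.
object_depth_mm = depth_from_disparity(disparity_px=25, focal_px=1000, baseline_mm=100)
blur_mm = circle_of_confusion(50.0, 2.0, 2000.0, object_depth_mm)
print(f"object depth: {object_depth_mm:.0f} mm, blur diameter: {blur_mm:.4f} mm")
```

Under these assumptions, a refocusing pipeline would evaluate the blur diameter per pixel from the estimated depth map and blur each pixel accordingly; points on the focus plane get a diameter of zero and stay sharp.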
Keywords/Search Tags: stereoscopic media, depth calculation, object proposals, multiple salient object detection, glasses-free 3D, video refocusing