Research On Abnormality Detection And De-redundancy In Wireless Capsule Endoscopy Videos Using Deep Learning Techniques

Posted on: 2022-03-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L B Lan
Full Text: PDF
GTID: 1480306536965389
Subject: Software engineering
Abstract/Summary:
Wireless capsule endoscopy (WCE) has been the most important innovation in endoscopy since video endoscopy replaced fiber optics. It is the preferred and unparalleled modality for the diagnosis and assessment of small bowel diseases because of its many advantages, especially its painless and noninvasive inspection. However, this advanced technology still faces several challenges. One main problem is that a single WCE procedure generates a large number of highly redundant images of the whole digestive tract, but only a small percentage of the video data is useful for diagnosis. Manually reviewing these WCE frames is time-consuming and difficult even for an experienced clinician, and it does not guarantee that no important abnormal information is missed. Another is that, since WCE was approved for clinical use in 2001, various methods have been proposed for classifying and segmenting abnormal WCE images and for reducing redundancy and reviewing time. More recently, artificial intelligence technology has also been applied to endoscopic image analysis. All of these can enhance the diagnostic yield of WCE. However, there were still few reports on the detection of lesion regions in WCE video before our abnormality detection work was completed, and even fewer on applying deep learning techniques to WCE image analysis. Additionally, in actual clinical practice, clinicians always want to confirm the detection results generated by computer techniques rather than risk missing something important in the WCE examination. This places higher expectations on the effectiveness and reliability of computer technology.

All the above-mentioned problems motivate us to explore, experiment with, and develop new and reliable computer-aided diagnosis methods that provide safe and reliable diagnosis results. Such methods are highly beneficial in helping clinicians identify problematic images as quickly as
possible, so as to reduce the workload of reviewing images and improve efficiency. Thus, based on deep learning techniques, we have carried out three main pieces of research on WCE videos from two aspects: abnormality pattern detection and redundancy removal.

(1) We address the abnormality pattern detection problem in WCE images using a deep convolutional neural network (CNN). In this work, we use a CNN to detect lesion regions. To this end, several methods are adopted to boost the abnormality detection performance of the proposed model in terms of CNN architecture design, region proposal, and transfer learning. First, we present a Cascade Proposal network model, which consists of a region proposal rejection module and a detection module. Second, we use a multiregional combination (MRC) method to obtain good coverage of region proposals. Third, we use a dense region fusion (DRF) method to refine object boundaries. Fourth, we introduce negative category (Neg) and transfer learning (TL) strategies into our CNN model. At training time, we use the MRC method to train the Cascade Proposal network end to end, so as to generate a small number of region proposals with high recall. Meanwhile, we use both the Neg and TL strategies to accelerate training and improve the generalization ability of the model. At testing time, we use the DRF and salient region segmentation methods to improve detection accuracy. Extensive experiments are performed on our WCE2017 image dataset, which contains more than 7k annotated images. The comprehensive results demonstrate that our model and method are efficient and effective for WCE abnormality detection, with high localization accuracy.

(2) We address the problem of unsupervised WCE video summarization using adversarial learning. In this work, we consider unsupervised WCE video summarization and cast it as a sequence-to-sequence learning
problem. Our key idea is to learn a deep summarizer network that minimizes the information loss between training videos and their summaries in an unsupervised way. To this end, we propose a hybrid yet effective unsupervised WCE video summarization method that combines long short-term memory (LSTM), a variational autoencoder (VAE), a pointer network (Ptr-Net), a generative adversarial network, and a de-redundancy mechanism (DM). The proposed model, termed Adv-Ptr-Der-SUM, adopts a generative adversarial framework consisting of a summarizer and a discriminator. The summarizer is a VAE-based LSTM architecture with Ptr-Net and DM that aims to learn the conditional probability of the output sequence and provide a compact summary. The discriminator is another LSTM that aims to distinguish the original video from the video reconstructed by the summarizer. The summarizer and discriminator are trained adversarially to optimize the summarizer and produce an optimal WCE video summary. Extensive experiments on our WCE-2019-Video dataset show that our model outperforms other video summarization approaches by a large margin in both supervised and unsupervised settings. The proposed model is also applied to two public multimedia benchmark datasets, SumMe and TVSum, which verifies its effectiveness and generality and demonstrates that it achieves competitive results.

(3) We address the problem of redundancy removal in WCE videos using correspondence matching and motion analysis. In this work, we propose a scheme, called SS-VCF-Der, that applies flow field estimation between two successive WCE frames to WCE motion analysis and then addresses the WCE de-redundancy problem based on the results of that analysis. To this end, we first propose a self-supervised technique for learning interframe visual correspondence representations from large amounts of raw WCE videos without human supervision, and then predict the flow field. Our key idea is to use the natural spatial-temporal coherence of
time and color in WCE videos as a free supervisory signal to learn WCE visual correspondence relations from scratch. We call this procedure self-supervised visual correspondence flow learning (SS-VCF). At training time, we use three losses to train and optimize the model: a forward-backward cycle-consistency loss, a visual similarity loss, and a color loss. At test time, we use the learned representation to generate a flow field for analyzing pixel movement between two successive WCE frames. Furthermore, from the resulting flow field estimate, we compute the motion intensity of the motion field between two successive frames (i.e., the extracted motion features) and use our proposed de-redundancy method, SS-VCF-MI, to select as key frames those with distinct scene changes in a local neighborhood, thereby removing redundancy. Extensive experiments on our collected WCE-2019-Video dataset show that our model and de-redundancy method achieve promising results, verifying the effectiveness of our scheme for visual correspondence representation and redundancy removal in WCE video.
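
The frame-selection step described in (3) can be illustrated with a minimal sketch. The abstract does not specify how SS-VCF-MI computes motion intensity or picks key frames, so everything here is an assumption made for illustration: motion intensity is taken to be the mean displacement magnitude of the flow field, a frame is kept when the motion leading into it is the strongest within a local window and exceeds a threshold, and the names `motion_intensity`, `select_key_frames`, `radius`, and `threshold` are hypothetical.

```python
import math


def motion_intensity(flow):
    """Mean displacement magnitude of one flow field.

    `flow` is a list of rows, each row a list of (dx, dy) pairs.
    Assumption: the thesis may define motion intensity differently.
    """
    total, count = 0.0, 0
    for row in flow:
        for dx, dy in row:
            total += math.hypot(dx, dy)
            count += 1
    return total / count if count else 0.0


def select_key_frames(intensities, radius=2, threshold=1.0):
    """Pick key frames from per-transition motion intensities.

    intensities[i] is the motion intensity between frame i and frame i + 1.
    Frame 0 is always kept. Frame i + 1 is kept when transition i is the
    strongest within `radius` transitions on either side and is at least
    `threshold` (both parameters are hypothetical; ties keep every maximum).
    """
    keys = [0]
    for i, value in enumerate(intensities):
        lo = max(0, i - radius)
        hi = min(len(intensities), i + radius + 1)
        if value >= threshold and value == max(intensities[lo:hi]):
            keys.append(i + 1)
    return keys


if __name__ == "__main__":
    # Toy intensities for a 9-frame clip (8 transitions), invented for the demo.
    demo = [0.1, 0.2, 3.0, 0.3, 0.1, 0.2, 2.5, 0.1]
    print(select_key_frames(demo))
```

In this sketch, long runs of near-static frames produce low intensities and are dropped, while frames that follow a strong local motion peak are retained as scene changes. A real pipeline would feed in flow fields predicted by the learned correspondence model rather than hand-written values.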
Keywords/Search Tags: Wireless Capsule Endoscopy, Abnormality Pattern Detection, Video Summarization, De-redundancy, Deep Learning