
The Application Of Multi-instance Learning Method For Mass Retrieval In Digitized Mammograms

Posted on: 2013-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: P F Lu
GTID: 2248330371461860
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Breast cancer is one of the leading causes of death among middle-aged women. In China, the incidence of breast cancer continues to grow rapidly. Early diagnosis and treatment can effectively increase patients' chances of survival. Mammography has become one of the most widely used approaches for early detection of breast cancer in current clinical practice. Studies show that computer-aided diagnosis (CAD) techniques can assist radiologists in detecting masses and micro-calcifications in mammograms, but the accuracy of mass detection with current CAD systems is still poor. Recently, content-based image retrieval (CBIR) techniques have been widely used in various CAD schemes, and related studies show that CBIR can help clinicians improve the precision of mass detection.

In clinical diagnosis, a benign or malignant lesion and the normal tissue around it are physically adjacent within a region of interest (ROI). The classical framework for CBIR is query by example (QBE). However, a QBE framework based only on feature matching cannot adequately solve the "semantic gap" problem in image retrieval, and it often needs to be combined with (supervised) machine learning approaches to improve retrieval precision. The query mass given by clinicians is often ambiguous and difficult to describe, which makes supervised learning a poor fit for the mass retrieval problem. Multi-Instance Learning (MIL) is a new machine learning framework for learning from this kind of ambiguity. Unlike supervised learning, the training set in MIL consists of bags and their labels; labels are attached only to bags of instances, not to individual instances.
A bag is labeled positive if at least one instance in that bag is positive; otherwise the bag is labeled negative. The goal of MIL is to predict the labels of new bags, using the labeled bags as the training set. When MIL is applied in CBIR systems, each image is treated as a labeled bag, and the segmented regions of the image correspond to the instances in that bag. MIL algorithms are then used to learn the concept that interests the user and to retrieve relevant images containing a similar concept.

The objective of this thesis is to study the application of MIL techniques to the mass retrieval task. In a mammogram retrieval system, the query mass is ambiguous and difficult to describe because the lesion and the normal tissue are physically adjacent. If the query mass is processed as an image bag, this ambiguity can be tackled with MIL techniques. In this thesis, we propose three image bag generators and use MIL algorithms to learn the target points and perform retrieval. An experimental study compares the retrieval performance of the three bag generators under different MIL algorithms; a bag generator called SBN is also compared against the three proposed generators.

The thesis consists of three parts. In the first part, three image bag generators, named J-Bag, A-Bag and K-Bag, are proposed. J-Bag is based on the JSEG image segmentation algorithm, A-Bag is based on a saliency-based bottom-up visual attention computational model, and K-Bag is based on a modified k-means clustering image segmentation algorithm. Each mass image is then converted into a corresponding image bag consisting of four 4-dimensional feature vectors. In the second part, two different mass databases were created. One is the DDSM database; the other consists of images collected from the Zhejiang Cancer Hospital in China.
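The bag representation and the bag labeling rule described above can be illustrated with a minimal Python sketch. The names `example_bag` and `bag_label` are hypothetical, and the feature values are made-up placeholders rather than features from the thesis.

```python
from typing import List, Sequence

# A bag is a list of instances; each instance is a feature vector.
# The abstract describes bags of four 4-dimensional vectors. The numbers
# below are placeholders for illustration only.
example_bag: List[List[float]] = [
    [0.12, 0.40, 0.33, 0.90],
    [0.55, 0.21, 0.08, 0.47],
    [0.70, 0.65, 0.91, 0.15],
    [0.03, 0.88, 0.52, 0.36],
]


def bag_label(instance_labels: Sequence[bool]) -> bool:
    """Standard MIL assumption: a bag is positive if at least one
    instance is positive, otherwise it is negative."""
    return any(instance_labels)


# Hypothetical instance labels for two bags.
print(bag_label([False, True, False, False]))   # one positive instance
print(bag_label([False, False, False, False]))  # no positive instance
```

In a real MIL training set the instance labels are hidden and only the bag label is observed; the function simply states the assumption that links the two.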
In the last part, during the training phase, several positive query examples and several negative examples are randomly selected for each mass type. A bag generator is then chosen to transform the mass images into image bags, and the target concept is learned by Diverse Density (DD), EM-DD and BP-MIP, respectively. After the target concept has been learned, the remaining mass images in the test set are ranked by their distance to the learned concept. Experimental results show that MIL techniques can be applied to mammogram retrieval systems, that the proposed bag generators A-Bag and K-Bag achieve more effective results than the existing bag generator SBN, and that the EM-DD algorithm achieves the best retrieval performance. Finally, the work is summarized and some areas that need improvement in future work are identified.
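The ranking step described above can be sketched in Python as follows. The abstract does not say how the distance from a bag to the learned concept is computed, so this sketch assumes a common choice: the Euclidean distance from the concept point to the closest instance in the bag. The function names, the concept point and the bags are all hypothetical.

```python
import math
from typing import Dict, List, Sequence

Bag = List[Sequence[float]]


def bag_distance(bag: Bag, concept: Sequence[float]) -> float:
    """Distance from a bag to the concept point.

    Assumption: use the closest instance in the bag (minimum Euclidean
    distance). The thesis may define this differently.
    """
    return min(math.dist(instance, concept) for instance in bag)


def rank_bags(bags: Dict[str, Bag], concept: Sequence[float]) -> List[str]:
    """Return bag names ordered from closest to farthest from the concept."""
    return sorted(bags, key=lambda name: bag_distance(bags[name], concept))


# Hypothetical learned concept and test bags (placeholder values).
concept = [0.5, 0.5, 0.5, 0.5]
test_bags = {
    "mass_a": [[0.5, 0.5, 0.5, 0.6], [0.9, 0.9, 0.9, 0.9]],
    "mass_b": [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]],
    "mass_c": [[0.5, 0.5, 0.2, 0.5], [0.1, 0.2, 0.3, 0.4]],
}

print(rank_bags(test_bags, concept))  # ['mass_a', 'mass_c', 'mass_b']
```

Taking the minimum over instances reflects the MIL assumption that a single instance matching the concept is enough to make an image relevant.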
Keywords/Search Tags: breast CAD, image retrieval, multi-instance learning, computer-aided diagnosis, bag generator