Breast cancer is the most common cancer among women. Micro-calcification clusters on X-ray mammograms are among the most important abnormalities, and detecting them is effective for early cancer detection. The Surrounding Region Dependence Method (SRDM), a statistical texture analysis method, is applied to detect Regions of Interest (ROIs) containing micro-calcifications. Inspired by the SRDM, we present a method that extracts gray-level and other features that are effective for predicting the positive and negative regions of micro-calcification clusters in mammograms. By constructing a set of artificial images containing only micro-calcifications, we map the entries of an SRDM matrix back to the suspicious calcification pixels in the original image. Features are extracted from these pixels for imbalanced data, and the repeated random sub-sampling method and a Random Forest (RF) classifier are then used for classification. The True Positive (TP) rate and False Positive (FP) rate are used to evaluate the result. The TP rate is 90% and the FP rate is 88.8% when the threshold q is 10. We draw the Receiver Operating Characteristic (ROC) curve, and the Area Under the ROC Curve (AUC) reaches 0.9224. The experiment indicates that our method is effective. A novel method for detecting regions of micro-calcification clusters is developed, based on new features for imbalanced data in mammography, and it can help improve the accuracy of computer-aided diagnosis of breast cancer.

This paper covers the following aspects:
1. The SRDM is described in detail and its feasibility is analyzed theoretically. We also explain why the method was proposed.
2. Our method, based on the inverse mapping of the SRDM matrix, is described in detail. From the pixels mapped back from the SRDM matrix, we extract features closely related to the original image; these features are described and shown to be effective.
3. Because the samples are imbalanced, we adopt the random sub-sampling method and a random forest
classifier to perform classification. We then calculate the accuracy to judge how all the methods perform, including the original SRDM method, methods from other papers, and our method. All results are based on the same database.

Like most image processing methods, our method consists of preprocessing, breast region extraction, cutting the breast region into sub-images, processing the sub-images with our method, feature extraction, classification, and calculation of the results. The detailed procedure is as follows. Most mammograms contain labels, and the background is an interference factor, so we must remove the labels and extract the edge of the breast region. The breast region is then extracted and cut into many sub-images of 128 × 128 pixels; these sub-images overlap, and the details are explained in the following section. The sub-images are processed with the SRDM; once the SRDM matrix is obtained, both the original method and our method are applied to it, that is, different features are extracted. Finally, the random forest classifier is used to predict the results, and the accuracies are calculated and compared.
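
The cutting step in the procedure above can be illustrated with a minimal sketch. It assumes the extracted breast region is a rectangular 2-D grid of gray levels. The 64-pixel stride, the extra border-aligned window, and the function names tile_origins and cut_sub_images are assumptions made for illustration only; the text fixes only the 128 × 128 sub-image size and states that the sub-images overlap.

```python
def tile_origins(height, width, tile=128, stride=64):
    """Return (row, col) top-left corners of tile x tile windows.

    stride < tile gives overlapping windows; stride=64 is an assumed
    value, not one taken from the paper.
    """
    if height < tile or width < tile:
        return []  # region too small to hold a single sub-image

    def axis_starts(length):
        starts = list(range(0, length - tile + 1, stride))
        # Add a final window flush with the border so no pixels are skipped.
        if starts[-1] != length - tile:
            starts.append(length - tile)
        return starts

    return [(r, c) for r in axis_starts(height) for c in axis_starts(width)]


def cut_sub_images(region, tile=128, stride=64):
    """Cut a 2-D list of gray levels into overlapping tile x tile sub-images."""
    height = len(region)
    width = len(region[0]) if height else 0
    return [
        [row[c:c + tile] for row in region[r:r + tile]]
        for r, c in tile_origins(height, width, tile, stride)
    ]
```

Each returned sub-image could then be passed to the SRDM step. With a stride of half the sub-image size, a cluster that straddles one tile border still falls entirely inside a neighbouring tile in most cases, which is one plausible reason to overlap the tiles.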