
Integrating Cost-sensitive And Deep Learning For Image Classification

Posted on: 2019-04-15
Degree: Master
Type: Thesis
Country: China
Candidate: J F Tan
GTID: 2348330569488925
Subject: Software engineering
Abstract/Summary:
With the rapid development of information and Internet technology, the amount of information people receive through the Internet every day is growing geometrically, and the volume of image data is growing with it as multimedia technology improves. At the same time, major corporations and government agencies increasingly rely on image classification technology in their daily work. Image classification generally proceeds in three steps: data input, feature extraction, and classification. Feature extraction is the basis of the classification task and directly affects classification performance. In addition, image data in some domains is distributed unevenly across classes. This imbalance not only degrades classification results but can also cause irreversible losses when important minority classes are neglected.

To address three problems in imbalanced image classification, namely low accuracy on the minority class, high misclassification cost, and the high cost of manual feature selection, this thesis proposes Triplet-CSSVM, an imbalanced image classification approach based on convolutional neural networks and cost-sensitive learning. The method has two parts: feature learning and cost-sensitive classification.

For feature learning, the convolutional neural network (CNN) is an effective deep learning method. However, the classic CNN is designed for balanced datasets, and with a softmax loss it does not learn enough image detail. This thesis therefore learns image features with a CNN that uses triplet loss as its loss function and combines it with re-sampling (triplet-sampling CNN). This approach learns more detailed image features and also balances the dataset to a certain extent. Transfer learning is introduced into CNN training: the network is pre-trained on the ImageNet dataset and then fine-tuned on the experimental dataset. This mitigates the poor network convergence and overfitting caused by the small amount of experimental data.

For cost-sensitive classification, traditional classification methods do not take cost information into account and struggle to obtain good results on imbalanced datasets. This thesis extends the traditional SVM to a cost-sensitive SVM (CSSVM) by assigning different cost factors to different classes. The optimization goal of CSSVM is to minimize the total cost of classification.

The experiments are implemented on the deep learning framework Caffe and use the portrait dataset FaceScrub and the spam dataset Personal Spam. Multiple sets of experiments are performed by varying the class distribution of the datasets. The results show that, compared with traditional classification methods, the proposed method achieves better classification performance on a variety of complex imbalanced data.
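The abstract does not give formulas for its two loss terms, so the following is only a minimal sketch of the general ideas, not the thesis's implementation. It assumes the commonly used triplet loss max(0, d(a, p) - d(a, n) + margin) with squared Euclidean distance, and a hinge loss in which each sample's penalty is scaled by a cost factor for its class. The function names, the margin value of 0.2, and the cost values are illustrative assumptions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss for one (anchor, positive, negative) triple.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (an assumed hyperparameter).
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def cost_sensitive_hinge(scores, labels, cost):
    """Total hinge loss with a per-class cost factor.

    scores: decision values f(x) of a linear classifier
    labels: true labels, each +1 or -1
    cost:   dict mapping each label to its cost factor (assumed values)
    """
    total = 0.0
    for score, label in zip(scores, labels):
        total += cost[label] * max(0.0, 1.0 - label * score)
    return total


if __name__ == "__main__":
    # The positive is far from the anchor and the negative is close,
    # so this triple is penalized.
    print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))

    # Minority class (+1) errors cost five times as much as majority (-1) errors.
    print(cost_sensitive_hinge([0.5, -2.0, 0.5], [1, -1, -1], {1: 5.0, -1: 1.0}))
```

With a larger cost factor on the minority class, a margin violation on a minority sample contributes more to the total than the same violation on a majority sample, which is the sense in which such a classifier minimizes total cost rather than error count.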
Keywords/Search Tags: image classification, imbalanced data processing, Convolutional Neural Networks (CNN), cost-sensitive