With the rapid development of modern medical imaging technology, medical images have become an important aid to diagnosis. However, as imaging modalities such as CT, MR, DSA, and DR multiply and more computer technology is integrated into imaging diagnosis, making effective use of medical image data has become increasingly urgent. Hospitals generate a large volume of medical images every day; a method that labels the category of each image automatically would greatly reduce doctors' workload and increase the utilization rate of the images. Medical image classification has therefore become a pressing need.
Traditional image classification methods are based on global features such as color, texture, and shape. These methods can perform well, but because their underlying idea is relatively simple, their further development has been limited.
The basic idea of the "bag of words" model (usually abbreviated as BoW) is to construct a term-document matrix from a dictionary. Applied to images, the same idea yields a bag of visual words, so that algorithms from the text field can be applied to images. Combining the two fields in this way can bring new ideas to each and so promote the development of both.
In this paper, the bag of words model is successfully introduced into the medical image classification problem. The basic procedure is as follows. First, the SIFT features of each image are extracted. Second, all features are clustered to generate a visual vocabulary, and the visual word representation of each image is constructed. Finally, the representations are fed into an SVM for training and testing.
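The second step of the procedure, turning an image's local descriptors into a visual word representation, can be illustrated with a minimal sketch. It assumes the descriptors have already been extracted and the vocabulary (the cluster centres) has already been learned; the function names `nearest_word` and `bow_histogram` are hypothetical, and the two-dimensional vectors stand in for real 128-dimensional SIFT descriptors. SIFT extraction, clustering, and the SVM are deliberately left out.

```python
import math


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary centre closest to the descriptor.

    Brute-force search using squared Euclidean distance.
    """
    best_index, best_dist = 0, math.inf
    for index, centre in enumerate(vocabulary):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, centre))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors, vocabulary):
    """Count how often each visual word occurs and normalise to sum to 1."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        # An image with no descriptors gets an all-zero representation.
        return [0.0] * len(vocabulary)
    return [count / total for count in counts]


# Toy data (assumed for illustration): two visual words, four descriptors.
vocabulary = [[0, 0], [10, 10]]
descriptors = [[1, 0], [0, 1], [9, 9], [0, 0]]
print(bow_histogram(descriptors, vocabulary))  # [0.75, 0.25]
```

Each image is thus reduced to a fixed-length vector whose length equals the vocabulary size, which is what allows a standard classifier such as an SVM to be trained on images with different numbers of descriptors.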
The experimental results show that the bag of words model improves the classification accuracy of medical images. To address the slow speed of the BoW model, the kd-tree algorithm is introduced: a high-dimensional index is built with a kd-tree, and the visual word representation of each image is then computed from this index. The experiments show that this method greatly speeds up the construction of the visual word representation of each image, and hence the BoW model as a whole. As in the text field, the BoW model applied to medical images also suffers from the problems of polysemy and synonymy. In this paper, the PLSA topic model is combined with the BoW model to address these problems, which further improves the classification accuracy of the BoW-based method.
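The kd-tree lookup described above can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: it assumes the index is built over the visual vocabulary, it performs exact nearest-neighbour search, and the names `build_kdtree`, `sq_dist`, and `nearest` are hypothetical. Exact kd-tree search tends to lose its advantage as dimensionality grows, so a practical system working with 128-dimensional SIFT descriptors may rely on an approximate variant instead.

```python
def build_kdtree(points, depth=0):
    """Build a kd-tree from a list of (vector, word_index) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])          # cycle through the dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2                    # median point becomes the node
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return (squared_distance, word_index) of the stored point nearest to query."""
    if node is None:
        return best
    vector, index = node["point"]
    dist = sq_dist(query, vector)
    if best is None or dist < best[0]:
        best = (dist, index)
    diff = query[node["axis"]] - vector[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < best[0]:
        best = nearest(far, query, best)
    return best


# Toy vocabulary (assumed for illustration) with four visual words.
vocabulary = [[0, 0], [10, 10], [0, 10], [10, 0]]
tree = build_kdtree([(vector, i) for i, vector in enumerate(vocabulary)])
print(nearest(tree, [9, 1])[1])  # 3
print(nearest(tree, [1, 8])[1])  # 2
```

The tree is built once per vocabulary and reused for every descriptor of every image, so the cost of construction is amortised over all lookups.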