Research And Implementation Of Multi-disease Diagnosis On Chest X-ray Based On Vision Transformer

Posted on: 2024-07-18
Degree: Master
Type: Thesis
Country: China
Candidate: J Ma
GTID: 2544307064985679
Subject: Software engineering
Abstract/Summary:
Chest X-ray imaging is fast and inexpensive, and it is one of the most common examinations for diagnosing chest diseases in clinical medicine. The wide application of deep learning has driven rapid progress in computer-aided medical diagnosis. Deep-learning-based chest X-ray diagnosis has broad prospects as an efficient early screening method: it can help doctors quickly screen for suspicious findings, relieve their diagnostic workload, and reduce the rates of misdiagnosis and missed diagnosis. Previous work on chest X-ray diagnosis relied on convolutional neural networks (CNNs) to identify and localize chest diseases. In recent years, because of the excellent performance of the self-attention model (Transformer) in natural language processing, many researchers have tried to transfer it to computer vision. Compared with a CNN, a Transformer can capture long-range dependencies between sequence elements more efficiently and mine the correlations between them more fully, and the features it extracts carry more semantic information. Taking advantage of the Transformer architecture, this paper uses the Vision Transformer as the backbone network and studies two questions: how to improve the model's attention to lesion information, and how to fuse diverse information for diagnosis. Two chest disease diagnosis models are proposed. The main research work is as follows:

1. Model 1 targets the problems that lesions in chest X-rays are small and hard to identify and that the data are insufficient and imbalanced. On the one hand, image patches are extracted with a sliding window so that, as far as possible, a small lesion area is not split across multiple patches; on the other hand, the model's focus is guided so that it pays more attention to key lesion areas. To address data imbalance, this paper adds a weight for each disease to the binary cross-entropy loss, which reduces the contribution of healthy samples to the loss and strengthens the network's ability to learn from diseased samples. The experimental results show that the model reaches an average AUC of 0.831 in diagnosing 14 common chest diseases, indicating good diagnostic ability.

2. Model 2 focuses on improving diagnostic accuracy through multi-source information fusion. A single image cannot contain all the information needed for disease diagnosis, so this paper introduces patient metadata in addition to the chest X-ray image data. The image data and the patient metadata are processed by two separate feature extraction networks to obtain higher-dimensional feature vectors. These features are then fused and fed into a classifier to diagnose chest diseases. In addition, this paper proposes a new data division method for the ChestX-ray14 dataset which ensures that images of the same patient do not appear in more than one of the training, validation, and test sets. With this new split, a small portion of redundant image data is converted into the patient's past medical history information and trained jointly with the other metadata, so that the model can learn the associations between chest diseases and further improve its diagnostic ability. Experiments show that the metadata converted from image information enables the model to learn the correlations between chest diseases effectively. The multimodal feature extraction network constructed in this paper also integrates multimodal input data effectively and offers collaborative image-semantic learning and extensibility.

As diverse case information continues to be collected and improved, the approach has broad prospects in clinical applications.
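The per-disease weighting of the binary cross-entropy loss described for Model 1 can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so the scheme below (positive weight = negatives / positives per disease) is an assumption, and the function names `class_weights` and `weighted_bce` are hypothetical.

```python
import math


def class_weights(labels):
    """Per-disease positive weights from multi-hot label rows.

    Assumed scheme (not taken from the thesis): weight = negatives / positives,
    so rare diseases count more than the abundant healthy samples.
    """
    n = len(labels)
    num_classes = len(labels[0])
    weights = []
    for c in range(num_classes):
        pos = sum(row[c] for row in labels)
        neg = n - pos
        # Fall back to a neutral weight if a disease never occurs.
        weights.append(neg / pos if pos > 0 else 1.0)
    return weights


def weighted_bce(probs, targets, pos_weights, eps=1e-7):
    """Mean binary cross-entropy with a weight on each disease's positive term."""
    total = 0.0
    count = 0
    for p_row, t_row in zip(probs, targets):
        for p, t, w in zip(p_row, t_row, pos_weights):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(w * t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


# Toy data: 4 images, 2 diseases, each disease present in one image.
toy_labels = [[1, 0], [0, 0], [0, 1], [0, 0]]
toy_weights = class_weights(toy_labels)
toy_loss = weighted_bce([[0.5, 0.5]], [[1, 0]], [3.0, 1.0])
```

With a positive weight above 1, a missed diseased sample costs more than a misclassified healthy one, which is the effect the abstract describes.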
Keywords/Search Tags: Medical image classification, Chest X-ray, Disease diagnosis, Vision Transformer, Metadata
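The patient-level constraint on the ChestX-ray14 split mentioned in the abstract (no patient appears in more than one of the training, validation, and test sets) can be sketched as follows. This shows only the grouping idea, not the thesis's actual division method; the function name `split_by_patient`, the 70/10/20 ratios, and the `(patient_id, image_name)` record layout are assumptions.

```python
import random
from collections import defaultdict


def split_by_patient(records, ratios=(0.7, 0.1, 0.2), seed=0):
    """Split (patient_id, image_name) records so each patient is in one set only.

    The ratios apply to patients, not images; they are illustrative values.
    """
    by_patient = defaultdict(list)
    for patient_id, image_name in records:
        by_patient[patient_id].append(image_name)

    # Shuffle patients (not images) with a fixed seed for reproducibility.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)

    n_train = int(len(patient_ids) * ratios[0])
    n_val = int(len(patient_ids) * ratios[1])
    groups = {
        "train": patient_ids[:n_train],
        "val": patient_ids[n_train:n_train + n_val],
        "test": patient_ids[n_train + n_val:],
    }
    return {
        name: [(pid, img) for pid in ids for img in by_patient[pid]]
        for name, ids in groups.items()
    }


# Toy data: 10 patients with 3 images each.
toy_records = [(p, f"{p}_{i}.png") for p in range(10) for i in range(3)]
toy_splits = split_by_patient(toy_records)
```

Splitting on patient identifiers rather than on images keeps follow-up images of one patient from appearing in both the training and the evaluation data.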