
Research on Autoencoder-Based Representation Learning Methods for Pathological Images

Posted on: 2022-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
Full Text: PDF
GTID: 2504306605966959
Subject: Master of Engineering
Abstract/Summary:
A key challenge in applying artificial intelligence to the large volume of unlabeled data collected in the real world is learning useful representations with weakly supervised or unsupervised methods. The resolution of traditional medical imaging is often insufficient to show cell-level information clearly. Whole-slide imaging, introduced with digital pathology, provides multi-scale, high-precision digital data, so pathological images carry far more microscopic detail. Convolutional neural networks have strong feature extraction capabilities and can extract features from digital pathology images effectively. However, pathologists are in short supply and pathology images are difficult to annotate, so labeled samples are hard to obtain. Using unsupervised methods to obtain reliable pathological representations, and thereby avoid wasting medical resources, has therefore become a general research trend.

Unsupervised learning aims to extract the key features of data without labels, and the quality of the feature representation directly determines performance on downstream tasks. A representation should not only be evaluated with the corresponding metrics; more importantly, it must not become an unexplainable "black box". This is especially true of unsupervised learning in the medical field. Feature visualization is therefore key both to ensuring feature reliability and to medical interpretability. Based on the idea of variational autoencoders, this thesis learns good representations of a liver cancer pathology dataset, addresses the problem of medical interpretability, and uses the representations to solve different downstream tasks. The main research contents are as follows:

1) Design an unsupervised autoencoder feature extraction model to characterize the liver cancer dataset. The model automatically extracts features from batches of pathological slides, compares the representational capabilities of different encoders across different tasks, and displays the features visually, through reconstructed and generated images, to ensure feature reliability.

2) Use the dimensionality reduction capability of the autoencoder to visualize the data in reduced dimensions, providing an initial analysis of the original dataset and a comparison of different dimensionality reduction methods.

3) Design a classification model based on the autoencoder. This model quantifies the representational ability of the autoencoder through classification metrics and further verifies whether features extracted in an unsupervised manner are adequate for mainstream classification tasks.

4) Design an interpretable algorithm for the high-risk features of the liver cancer dataset. The algorithm quantitatively screens out high-risk features by analyzing the correlation between the representations of the liver cancer dataset and the patients' prognostic information, then uses the generative capability of the autoencoder to visualize these high-risk features and give them medical interpretability.

The experimental results show that images reconstructed by the autoencoder model are sharp compared with the originals, with a mean squared error as low as 0.0122. Together, the reconstructed and generated images of the variational autoencoder family of models indicate that the learned representations capture the key features of the dataset. The autoencoder-based classification model reaches an accuracy of 91.4%, which indicates that unsupervised feature extraction still lags behind supervised learning to some extent but nonetheless yields good representations. The visualization of high-risk features identified medically interpretable features, such as the proportion of capillaries, the proportion of dark nuclei, and the color of the cytoplasm, that correlate strongly with the prognosis of cancer patients. This further shows that the model learns high-level features, and that the approach is a novel technique combining unsupervised analysis with interpretability. Visual comparisons with PCA and t-SNE in 2D and 3D also show that the autoencoder has good dimensionality reduction capabilities and can support an initial understanding of the original dataset. Finally, the liver cancer pathology generative model in this thesis can generate data that do not exist in the original dataset.
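
The screening step in item 4) can be illustrated with a minimal sketch. The abstract does not say which correlation measure or threshold the thesis uses, so the Pearson coefficient, the `threshold` value, the function names, and the toy data below are all assumptions made for illustration, not the author's method.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def screen_high_risk_features(latents, prognosis, threshold=0.5):
    """Rank latent dimensions by |correlation| with a per-patient prognosis score.

    latents:   one latent vector per patient (e.g. encoder outputs); hypothetical input.
    prognosis: one numeric prognosis value per patient; hypothetical input.
    threshold: assumed cut-off on |r|; the thesis does not state one.
    Returns a list of (dimension_index, r) sorted by descending |r|.
    """
    n_dims = len(latents[0])
    selected = []
    for d in range(n_dims):
        column = [z[d] for z in latents]
        r = pearson(column, prognosis)
        if abs(r) >= threshold:
            selected.append((d, r))
    selected.sort(key=lambda item: abs(item[1]), reverse=True)
    return selected


# Toy data: 4 patients, 3 latent dimensions (made up for the example).
latents = [
    [0.1, 2.0, 0.5],
    [0.4, 1.0, 0.5],
    [0.9, 1.0, 0.5],
    [1.3, 2.0, 0.5],
]
prognosis = [10.0, 20.0, 30.0, 40.0]

ranked = screen_high_risk_features(latents, prognosis)
for dim, r in ranked:
    print(f"latent dimension {dim}: r = {r:.3f}")
```

In this toy example only the first latent dimension passes the threshold. In a setting like the one the abstract describes, the selected dimensions would then be varied and decoded to visualize what each one represents; survival-analysis methods could replace the plain correlation used here.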
Keywords/Search Tags: Unsupervised, representation learning, autoencoder, pathological image, interpretable