The mammary gland is an important organ that secretes milk to feed offspring, and a breast tumor is a benign or malignant tumor arising in the mammary gland. Breast cancer is a major threat to women's health, with diverse causes and a rising incidence. Mammography is the first choice for breast cancer screening because of its simple operation, high resolution, and high reliability. The mammography examination report records the patient's imaging findings and the doctor's clinical diagnosis and diagnostic opinion; it is a high-level presentation and summary of the image and contains rich semantic information. Using deep learning to assist doctors in the Artificial Intelligence (AI) diagnosis of breast cancer is therefore an important application of AI-assisted medical treatment, which helps to improve doctors' work efficiency and relieve their working pressure. In recent years, many researchers have taken an interest in the AI diagnosis of breast tumors, and a variety of diagnostic models have been proposed, but none has been applied to clinical diagnosis. There are two main reasons: traditional diagnostic models based on machine learning are not accurate enough, while current diagnostic models based on deep learning are more accurate but lack transparency and interpretability in the diagnostic process. To address the low accuracy of traditional diagnostic models and the lack of interpretability of modern ones, this paper uses a real mammography report dataset from a Grade III, Class A hospital in Shanghai to propose an interpretable AI diagnosis model for breast tumors based on semantic embedding and a capsule network. An interpretable auxiliary diagnosis system for breast tumors is also designed and implemented to help doctors diagnose efficiently. The main work of this paper covers the following three aspects.

1) A semantic embedding method fused with mammography report structure is proposed to obtain word vectors. First, the characteristics of the mammography reports collected from a Grade III, Class A hospital in Shanghai were analyzed. Then, each report was segmented using a report semantic segmentation algorithm, and labels were extracted from the Breast Imaging Reporting and Data System (BI-RADS) classification results to complete the preparation of the dataset. Next, following the rule "segment - tissue description sentence - attribute description sentence", a semantic tree in Extensible Markup Language (XML) format, containing a single lesion and a single site, was constructed by dependency parsing. Finally, the semantic tree was integrated into Bidirectional Encoder Representations from Transformers (BERT) to obtain a pre-training method with fused semantic embedding for mammography reports. The resulting word vectors are specific to the medical domain and encode the hierarchical relationships of breast tissue.

2) A breast diagnosis prediction and interpretability model based on a capsule network is constructed. In this model, the pre-trained word vectors are fed into multi-head attention and a capsule network to predict whether a breast tumor is benign or malignant. First, multi-head attention extracts semantic feature representations. Then, an improved capsule network performs the classification, which extends the capsule network from computer vision classification tasks to the classification of mammography reports. The experimental results show that the micro-averaged precision, recall, and F1 score of the proposed model are 91.58%, 91.58%, and 91.58%, and the macro-averaged precision, recall, and F1 score are 75.95%, 79.73%, and 77.14%, respectively. In addition, the model integrates four methods to explain its predictions for individual examples: local self-interpretation based on the back-propagation principle and the dynamic routing algorithm; global self-interpretation based on word-frequency statistics over the multi-head vectors; the model-agnostic, perturbation-based Local Interpretable Model-agnostic Explanations (LIME) method for local interpretation; and the model-agnostic, game-theory-based SHapley Additive exPlanations (SHAP) method for global interpretation.

3) An interpretable auxiliary diagnosis system for breast tumors is designed and implemented. The system builds on the word vector extraction module, which uses BERT pre-training with the embedded semantic tree, and on the breast diagnosis prediction and interpretability model based on multi-head attention and the capsule network, and it can help doctors give better diagnostic advice. The system was developed through requirements analysis, use case analysis, system design, and database design. For doctors, it mainly provides mammography report management, diagnosis prediction management, and interpretability analysis. Administrators mainly manage user permissions and user information to better ensure system security.

In summary, this paper develops an interpretable auxiliary diagnosis system for breast tumors, built on word vectors obtained by BERT pre-training with fused semantic embedding of mammography reports and on the diagnosis prediction and interpretability model for breast tumors.
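The reported micro-averaged precision, recall, and F1 score are identical, while the macro-averaged values are lower. This is what one would expect for single-label classification with imbalanced classes. The following is a minimal sketch of the two averaging schemes, not the evaluation code used in this work: the function name `micro_macro_scores`, the labels, and the toy predictions are all hypothetical, and macro F1 is assumed here to be the unweighted mean of the per-class F1 scores.

```python
from collections import Counter


def micro_macro_scores(y_true, y_pred):
    """Return micro- and macro-averaged (precision, recall, F1).

    Hypothetical helper for illustration; assumes one label per sample.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed

    def prf(tp_, fp_, fn_):
        precision = tp_ / (tp_ + fp_) if tp_ + fp_ else 0.0
        recall = tp_ / (tp_ + fn_) if tp_ + fn_ else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        return precision, recall, f1

    # Micro: pool the counts over all classes, then compute the scores once.
    micro = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))

    # Macro: compute the scores per class, then take the unweighted mean.
    per_class = [prf(tp[c], fp[c], fn[c]) for c in labels]
    macro = tuple(sum(s[i] for s in per_class) / len(per_class) for i in range(3))
    return {"micro": micro, "macro": macro}


# Invented, imbalanced toy data (not taken from the hospital dataset).
y_true = ["benign"] * 8 + ["malignant"] * 2
y_pred = ["benign"] * 8 + ["benign", "malignant"]

scores = micro_macro_scores(y_true, y_pred)
print("micro (P, R, F1):", scores["micro"])
print("macro (P, R, F1):", scores["macro"])
```

With one label per sample, every misclassification counts as one false positive and one false negative, so the pooled precision, recall, and F1 score all equal the overall accuracy. Macro averaging gives each class equal weight, so weak performance on a rare class pulls the macro scores down.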