Knowledge Graph Of Risk Factors Of Major Cancers In China

Posted on: 2021-04-19
Degree: Master
Type: Thesis
Country: China
Candidate: B W Tang
GTID: 2404330614964432
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: To construct a knowledge graph of risk factors for six major cancers in China (lung, gastric, colorectal, esophageal, breast and ovarian cancer) that incorporates the risk factors identified in domestic and international research, and to provide a theoretical basis for national and local interventions against these cancers.

Methods: Expert consultation was used to evaluate the risk factor databases for the six cancers built in a previous study, to determine whether they could be used directly to construct the knowledge graph of cancer risk factors, and to construct the ontology framework of the knowledge graph. A literature review was used to systematically survey knowledge graph construction models and the associated experimental validation methods. The Wanfang Database and CNKI were searched and screened for literature on the six cancers and their risk factors, which provided the data source for the knowledge graph. The knowledge graph was built with artificial intelligence methods: a random sample of documents was drawn from the screened literature library, extraction rules were written manually and applied to extract data, the first-round results were annotated and used for training, a second round of extraction was then performed, and the extracted results were finally reviewed manually. All extracted entities were analyzed and classified, the ontology was updated according to the classification of risk factors, the knowledge graph was constructed, and the graph underwent platform testing and content evaluation.

Results: Evaluation of the risk factor thematic databases for the six malignant tumors showed that the gastric cancer database was built by a reasonable process and gave reliable results, so it could be used directly to build the knowledge graph. The databases for lung, colorectal, esophageal, breast and ovarian cancer were not built rigorously and some of their results were highly controversial, so they could not be used directly. From the literature search, a total of 2,030 references were used for data extraction: 460 on lung cancer, 388 on colorectal cancer, 410 on esophageal cancer, 685 on breast cancer and 87 on ovarian cancer. The first round yielded 518 risk factor entities, 121 protective factor entities, 14 high-risk population entities, 5 disease entities and 769 triples. Building on the first round, the second round yielded 1,062 risk factor entities, 174 protective factor entities and 9 high-risk population entities, with a total of 6,235 relationship pairs. After manual verification and entity alignment, 956 risk factor entities, 241 protective factor entities and 4 high-risk population entities were retained. The risk factors were classified by their characteristics into 8 categories: behavior and lifestyle factors, genetic factors, physical and chemical environmental factors, disease factors, drug factors, social and psychological factors, reproductive factors and other factors. After the two rounds of extraction and the manual review, the accuracy rate, recall rate and F1 value of the extraction results were calculated. The accuracy rate of entity extraction was lowest for the breast cancer literature (47.47%) and highest for the ovarian cancer literature (77.06%); the F1 value was likewise lowest for breast cancer (57.44%) and highest for ovarian cancer (82.85%). The application evaluation showed that the knowledge graph retrieves knowledge efficiently, directly shows the associations between different cancer types and risk factors, and has considerable practical value. The knowledge graph was also compared with existing monitoring systems. China's chronic disease risk factor monitoring system mainly covers diet, smoking, alcohol consumption, physical activity, family history and disease factors. Monitoring in areas of China with a high incidence of digestive tract tumors covers similar categories but differs in diet, family history and disease history: for diet it focuses more on food preparation and intake, and its family history and disease history items are limited to factors related to the digestive tract. Neither monitoring system covers physical and chemical environmental factors.

Conclusion: This study used expert consultation, literature review and artificial intelligence to construct a knowledge graph of risk factors for lung, gastric, colorectal, esophageal, breast and ovarian cancer. The cancer databases were evaluated using the experience of authoritative experts and the ontology framework of the knowledge graph was established on that basis, which gives the core content of the knowledge graph a reasonable and reliable foundation. Literature was used as the data source because it reflects the current state of research in the field, and machine learning was used as an efficient means of extraction. The resulting knowledge graph was evaluated for knowledge retrieval, for content comparison with monitoring systems and for potential extensions of its application. The graph enables rapid retrieval of diseases, factors and their relationships, and it can be updated continuously as researchers' understanding of cancer deepens, which keeps the knowledge current. Extending its application to different populations could make it an important tool for future cancer prevention and control.
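The triples and retrieval described in the abstract can be pictured with a minimal sketch. It assumes each extracted fact is stored as a (factor, relation, disease) triple; the relation names, the three sample triples and both function names are hypothetical illustrations and are not taken from the thesis data or its platform.

```python
from collections import defaultdict

# Hypothetical sample triples for illustration only (not thesis data).
TRIPLES = [
    ("smoking", "risk_factor_for", "lung cancer"),
    ("smoking", "risk_factor_for", "esophageal cancer"),
    ("physical activity", "protective_factor_for", "colorectal cancer"),
]


def build_index(triples):
    """Index triples by disease so factors can be looked up quickly."""
    by_disease = defaultdict(set)
    for factor, relation, disease in triples:
        by_disease[disease].add((factor, relation))
    return by_disease


def shared_factors(by_disease, disease_a, disease_b, relation):
    """Return factors linked to both diseases through the same relation."""
    factors_a = {f for f, r in by_disease.get(disease_a, set()) if r == relation}
    factors_b = {f for f, r in by_disease.get(disease_b, set()) if r == relation}
    return factors_a & factors_b


index = build_index(TRIPLES)
common = shared_factors(index, "lung cancer", "esophageal cancer", "risk_factor_for")
```

A lookup of this kind is one way a graph could show which factors are associated with more than one cancer type; a production system would more likely use a graph database than an in-memory dictionary.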
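The evaluation metrics named in the Results can be sketched as follows. This assumes that the "accuracy rate" corresponds to precision (the share of extracted entities that are correct), which is the usual pairing with recall when an F1 value is reported; the abstract does not state the exact definitions, and the function name and sample entity sets below are hypothetical.

```python
def precision_recall_f1(extracted, gold):
    """Compare extracted entities with a manually reviewed gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: 4 entities extracted, 3 in the reviewed set, 2 overlap.
p, r, f1 = precision_recall_f1(
    {"smoking", "alcohol", "obesity", "tea"},
    {"smoking", "alcohol", "family history"},
)
```

Under these definitions a low precision with a higher F1, as reported for breast cancer, would imply a comparatively high recall.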
Keywords/Search Tags:Cancer, risk factors, knowledge graph, artificial intelligence, monitoring, cancer prevention and control