With the continuing development of evidence-based medicine and informatization, clinical evidence has attracted increasing attention. Randomized controlled trial (RCT) evidence, as the highest-quality form of original research, underpins the formulation of evidence-based clinical practice guidelines. In pursuit of high-quality evidence, a large number of traditional Chinese medicine (TCM) RCTs have been carried out at home and abroad in recent years. The Cochrane Library contains more than 40,000 TCM RCTs, which provide an evidence base for the formulation of evidence-based TCM clinical guidelines. At present, when evidence-based clinical practice guidelines are formulated, the key information in RCT literature is extracted and evaluated mainly by hand, which is time-consuming and labor-intensive and makes it difficult to ensure that information extraction is comprehensive, accurate and consistent. To support the rapid identification, extraction, evaluation and sharing of information, this research takes the TCM RCT literature as its research object, constructs a semantic model of TCM RCT evidence collection, and explores the construction of a knowledge organization system for TCM RCT evidence collection based on that semantic model, laying a foundation for the automatic extraction of TCM RCT evidence.

1 Purpose

To extract the key information of TCM RCT evidence comprehensively, accurately and automatically, this study addresses the needs of evidence-based practice guideline formulation. Starting from the current state of TCM RCT research, and drawing on previous experience in formulating myopia guidelines, it uses the evidence-based medicine PICOS (Participant/population, Intervention, Comparison, Outcome, Study design) model to explore a model for acquiring information from TCM RCT literature. On the one hand, this study investigated the evidence-stage needs in the development of evidence-based TCM clinical guidelines, determined the core
concepts of TCM RCT literature evidence, provided a standardized and structured framework of conceptual terms, and constructed a semantic model for TCM RCT evidence collection. On the other hand, relying on ontology and natural language processing technologies, it built a knowledge organization system for TCM RCT evidence collection and explored the feasibility of automatic TCM RCT evidence extraction, so as to provide evidence-based researchers with comprehensive and rapidly available data and to meet the needs of clinical question construction, evidence acquisition and evaluation in the formulation of evidence-based TCM clinical guidelines.

2 Content and method

This study surveyed the literature on evidence-based medicine and RCT evidence, reviewed the evaluation standards and tools for TCM RCT evidence, and carried out the following work. First, using a semantic annotation platform and the evidence-based medicine PICOS model, and taking Chinese-language RCT literature on TCM treatment of myopia as an example, literature retrieval and preprocessing were carried out. Then, according to the evaluation criteria and tools identified in the literature survey, the text structure of TCM RCT literature and the key information required for evidence collection were analyzed, and the core concepts and relationships related to TCM RCT evidence were extracted. The extracted concepts were then further normalized, and the concept sources of the TCM RCT semantic model were determined step by step during normalization. Next, on the Knowledge Organization System (KOS) platform developed by the project, the semantic ontology model of TCM RCT evidence was constructed using a seven-step method in accordance with the FAIR principles. In parallel, 200 RCT articles were manually annotated on the TCM literature entity annotation management platform, and the exported annotation data were imported into the platform as instances. Finally, on
this basis, the quality testing tool of the KOS platform was used for quality verification, and an ontology reasoner was used to verify naming consistency and logical consistency. A model combining the pre-trained BERT-WWM with BiLSTM-CRF was used for training and for verifying automatic extraction. In addition, the efficiency and content accuracy of automatic and manual extraction were compared to verify whether automatic extraction can assist manual extraction and contribute to the evidence extraction stage of guideline formulation.

3 Results

Through literature research, the core concepts of the model were drawn from the Guidelines International Network (GIN) minimum data set of evidence tables, the Consolidated Standards of Reporting Trials (CONSORT) and its TCM extension, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), the Cochrane Handbook and other sources, and 29 core elements were identified based on PICOS. With reference to evidence-based medicine databases such as the TCM clinical research evidence base system, the semantic model was divided into two modules, basic literature information and PICOS information, and 34 core concepts were determined. Then, drawing on the EBMO ontology, using the seven-step method of the Stanford University School of Medicine, and reusing the SEPIO ontology, the RCT evidence collection ontology was constructed. In total, the TCM semantic model comprises 85 classes, 17 object properties and 31 data properties, which integrate TCM RCT knowledge, represent it in a structured way, and systematically capture the key knowledge that RCT evidence must supply in the evidence stage of evidence-based TCM guideline formulation. Through literature retrieval and screening, 327 Chinese-language RCT articles on TCM treatment of myopia were included. In the early stage, 200 articles were randomly selected for manual annotation. After 24 rounds of model training on the BIO annotation set, the
precision on the validation set for TCM RCT evidence collection was 84.63%, the recall was 91.46%, and the F1 value was 87.91%, a relatively satisfactory result. Then, the trained model was integrated with the ontology of the TCM RCT evidence collection semantic model, and automatic extraction of TCM RCT evidence was preliminarily realized. Finally, 5 unannotated RCT articles on TCM treatment of myopia were selected for comparative verification. Automatic extraction took about two and a half minutes in total, a significant improvement in efficiency over manual extraction. In addition, the automatically extracted content was relatively accurate, but its completeness still needs to be improved.

4 Conclusion

The TCM RCT evidence collection ontology can serve as a reference for the standardized processing of TCM RCT evidence, helps to process, reuse and share information, realizes knowledge integration and knowledge representation in the field of evidence-based TCM, and provides guideline developers and researchers with more intuitive and accurate knowledge of TCM RCT evidence. Automatic extraction of TCM RCT evidence assists guideline developers in extracting data, improves the efficiency of manual extraction, provides RCT information for the clinical question construction and evidence extraction stages of TCM clinical practice guideline formulation, and makes maximal use of resources.
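
As a note on the evaluation figures in the Results: the reported F1 value of 87.91% is consistent with the harmonic mean of 84.63% and 91.46%. The Python sketch below illustrates how such figures can be computed from BIO-tagged sequences. It is a minimal illustration and not the study's evaluation code: the tag labels (`INT`, `OUT`), the function names, and the choice of entity-level exact-match scoring are all assumptions, since the abstract does not state how the metrics were computed.

```python
from typing import List, Set, Tuple

# (label, start index, end index exclusive); a hypothetical span representation
Span = Tuple[str, int, int]


def decode_bio(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence.

    A stray I- tag that does not continue the current entity closes
    that entity and is otherwise ignored.
    """
    spans: Set[Span] = set()
    label, start = None, 0
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" flushes the last span
        if tag.startswith("I-") and label == tag[2:]:
            continue  # still inside the current entity
        if label is not None:
            spans.add((label, start, i))
            label = None
        if tag.startswith("B-"):
            label, start = tag[2:], i
    return spans


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Entity-level scores: a predicted span counts only on an exact match."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_from(precision, recall)


# Hypothetical labels: INT = intervention, OUT = outcome (PICOS elements)
gold_tags = ["B-INT", "I-INT", "O", "B-OUT", "I-OUT", "O"]
pred_tags = ["B-INT", "I-INT", "O", "B-OUT", "O", "O"]

scores = precision_recall_f1(decode_bio(gold_tags), decode_bio(pred_tags))
reported_f1 = f1_from(84.63, 91.46)  # figures taken from the Results
```

In the toy example the predicted outcome span is one token too short, so only one of the two predicted spans matches exactly. Exact-match scoring is stricter than token-level scoring; which of the two the study used is not stated in the abstract.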