| Objective: Traditional Chinese medicine (TCM) has certain advantages in treating chronic, intractable and untreated diseases, but it has long been criticized for lacking scientific theoretical support and efficacy data. The clinical practice of TCM focuses on regulating the overall health status of patients rather than on the clinical indicators commonly used in Western medicine, so its improvement of patients' physical condition and quality of life is difficult to evaluate scientifically under the current system. We therefore proposed developing and using a digital TCM research platform to meet the basic needs of TCM modernization: to explore a new real-world research paradigm for the specialist diseases in which TCM has advantages, to collect comprehensive health records with TCM characteristics from real-world patients, and to form an efficacy evidence system for the corresponding specialist diseases, providing technical support and methodological reference for related research. Endocrine diseases are chronic systemic metabolic diseases; common types include diabetes and thyroid-related diseases, and their diagnosis and treatment are related to the individual's life-cycle status. TCM has certain advantages in treating such diseases. Modern medicine manages some thyroid diseases only by observation or surgical intervention, whereas TCM "internal treatment + external treatment" regulates the overall state of the body and can better improve patients' quality of life. The problem is the lack of a complete collection system and a dynamic tracking system to record the subjective information and evidence of TCM treatment of thyroid disease; as a result, TCM clinical research has long been criticized for lacking scientific theoretical support and efficacy data. Different
from thyroid disease, for which scientific research using TCM characteristic information is still at an initial stage, the research paradigm for type 2 diabetes is very mature, with at least 10,000 research papers published every year. In addition, the research group has carried out extensive extraction and mining of the RCT literature on type 2 diabetes and has accumulated substantial literature knowledge, which can provide ideas and reference for clinical research on TCM treatment of thyroid diseases. Therefore, this study aims to establish a digital scientific research platform for traditional Chinese medicine on the basis of literature mining, and ultimately to achieve scientific collection of evidence on the curative effect of TCM on thyroid diseases. The main steps of this study are as follows: (1) Based on a database of 63,186 Chinese RCT articles, statistical and machine learning methods were used to mine the TCM diagnosis and treatment information units in the existing TCM randomized controlled trial (RCT) literature, in order to understand the advantages and disadvantages of real-world clinical research and to provide a basis for the doctor-side data collection standards of the specialized TCM research platform. (2) Build a research platform for TCM specialties. This part of the study was built on the results of the literature research, and a custom research module was selected to design a real-world high-dimensional data collection system. The platform was built on the design principles of improving data utilization and minimizing the amount of operation. ① The patient side uses online mini-programs for follow-up visits, systematically collects high-dimensional TCM treatment information (e.g., sub-health information such as TCM constitution, lifestyle habits and emotional state), adds variables related to specialty diseases, and establishes users' personalized health records. ② The doctor side
relies on the offline specialist research system as a platform to collect real-world data in a standardized and efficient manner, learns from the collected clinical information through machine learning training, calculates the cosine distance between keywords to obtain their cosine similarity, and finally forms a standard real-world TCM specialist terminology database. ③ The operation page of the specialty platform helps doctors fill in TCM treatment information efficiently by adding intelligent guidance functions, displaying content according to the cursor position and intelligently sorting option information, which reduces the number of clicks and improves work efficiency. ④ The platform integrates a dynamic specialty scale to collect patient specialty information in real time, covering pre-consultation screening and post-consultation follow-up throughout the whole process, and compares and analyzes the results of previous visits so that doctors can adjust treatment plans in a timely manner. ⑤ We compared the platform with the existing TCM outpatient system in three aspects (system operation time, content inclusion and completeness of information filling) to evaluate the performance of the specialty platform. (3) Real-world scheme design and platform application. Taking the characteristics of thyroid diseases as the entry point, the real-world research design for thyroid diseases in TCM is innovatively carried out in the following ways, which differ from the traditional research paradigm: ① There are no experimental inclusion restrictions, and patients in thyroid specialty departments are broadly included; ② The design is not limited to the indexes commonly used in the traditional thyroid specialty, but adds systematic collection of subjective phenotype and quality-of-life information in TCM; ③ Through training and learning on high-dimensional information, a real-world terminology database is formed, and a syndrome
prediction model is established to provide scientific methods and ideas for TCM syndrome differentiation and treatment. The main results are: (1) Results of literature mining as the basis for the platform's data standards: First, we extracted evidence from 63,186 RCTs on 12 TCM dominant diseases as model diseases and constructed a TCM clinical evidence warehouse containing these 63,186 TCM RCTs. The results showed that: ① intervention groups mostly used a combination of TCM and Western medicine, with almost none using TCM therapy alone; ② the effective rate used as a key indicator is vague and the outcome indicators are limited; ③ the sample size is small and the dimensionality of the collected information is low; ④ the follow-up period is short and participants drop out easily; ⑤ life-cycle health information collection and dynamic tracking are lacking; ⑥ much of the information is unstructured; ⑦ "TCM syndrome type", "TCM diagnosis" and "TCM outcome index" are generally missing, and the existing RCT studies and literature are also lacking in quality. These results provide the basis for establishing data standards to optimize the platform design. In addition, in the data standard establishment stage, 5,091 TCM diagnosis and treatment information records on type 2 diabetes were mined under the guidance of the research paradigm of type 2 diabetes. 73% of the studies mainly lacked TCM subjective symptoms, tongue signs, pulse signs, TCM syndrome types, Chinese herbal medicines, proprietary Chinese medicines and acupuncture points. In 27% of the studies the diagnosis and treatment information was relatively complete, but other non-medical professional information was lacking. At the same time, the design, grouping, observation indexes and outcomes of the RCTs are not in line with the actual situation of clinical research. Based on the deficiencies found in the above database mining results, the following real-world data collection criteria were preliminarily formed in this
study: ① structure TCM tongue, pulse and subjective phenotypic symptoms; ② add TCM constitution information and TCM specialty scales to reduce subjective bias and quantify TCM syndromes; ③ build a real-world glossary through machine learning; ④ expand the range of high-risk factors (such as living habits and mental state) and collect high-dimensional information as comprehensively as possible; ⑤ provide intelligent filling that prioritizes historical options to improve the completeness of information filling; ⑥ optimize doctor-patient information collection methods to improve patient compliance; ⑦ provide intelligent warning prompts for drug dosage; ⑧ collect data without grouping and regardless of disease, with longitudinal dynamic tracking. This part of the work has yielded 1 published SCI paper and 1 paper under external review. (2) In addition to pre-diagnostic data mining, we also set up a specialized quality assessment system based on the structure of the existing RCT literature and systematically evaluated 2,776 papers on six types of diseases, including diabetes. We found that many studies lacked TCM concepts in describing and reasoning about diseases, selecting and grouping participants, and reporting outcomes, and were missing much information on randomization methods, blinding and pre-calculated sample sizes. In this paper, while addressing these deficiencies, the design and development of a platform was completed based on the data criteria identified by RCT mining: ① The digital diagnosis and treatment scientific research platform has completed its first version, Version 1.0 (https://81.70.174.192/), and 7 patents are being granted (covering page display, automatic identification, intelligent prescription, historical symptom tracing and comparison, dynamic specialty scale, early automatic screening method, and dynamic structured real-world TCM specialty disease library). ② Using efficiency test feedback on operation clicks, operation time, completeness of information filling and content
inclusion, we found that, compared with the existing TCM outpatient system, this system requires 6 fewer operation clicks (from 56 to 50 on average), is 2-3 minutes faster to operate (from 10 minutes to 8 minutes on average), and achieves 58 percentage points higher completeness of information filling (from 17% to 75% on average) and 22 percentage points higher content inclusion (from 58% to 80% on average). ③ The specialist follow-up mini-program has obtained 6 patents and has nearly 1,000 registered users. The platform as built now better realizes holographic, efficient, intelligent and humanized high-dimensional data collection throughout the whole process of real-world research. (3) Preliminary analysis results from applying the digital diagnosis and treatment research platform, with the thyroid specialty as an example: In response to the problems of evidence quantification found in the RCT mining results in Chapters 2, 3 and 4, the platform design collects qualitative and quantitative TCM evidence information through a combination of self-assessment and assessment by others, and collects patient information (TCM constitution, psychological state, living habits, etc.) as comprehensively as possible in high dimension. A brief analysis of 1,000 records collected by the platform was carried out to provide reference values for implementing subsequent real-world research protocols. The analysis showed that the syndrome types of thyroid diseases comprise 11 types, and machine learning modeling was performed for the liver-depression and spleen-deficiency syndrome. The decision tree for this syndrome had the highest AUC value, indicating that the decision tree model was more stable; its accuracy could reach 1.00, and its other indicators were above 0.75. The AUC value of the K-nearest neighbor model for the liver-depression and spleen-deficiency syndrome was comparatively high, but only 0.62, and the accuracy could
reach 1.00. The model prediction results suggest that the decision tree and K-nearest neighbor models are more suitable for syndrome modeling. The above experimental results show that this project takes the mining of TCM clinical RCT literature as its entry point and, on the basis of the resulting data collection standards, combines medical, statistical and computational methods to better corroborate the theoretical basis of TCM. Based on three consecutive research links (data collection, data analysis and theory validation), the design concept, usage scenarios and application value of the platform are proposed, and a specialized scientific research platform applicable to TCM treatment of thyroid diseases is initially established. The advantages of this platform include: ① standardized and efficient collection of information on the characteristics of TCM dominant diseases by combining online and offline methods; ② establishment of a high-dimensional life-cycle health profile of individuals by collecting high-dimensional information on the study subjects; ③ exploratory analysis of the correlation between disease evidence and influencing factors using unsupervised learning methods. The above research results will provide technical support and methodological reference for research on related diseases. |
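
The abstract mentions computing the cosine distance between keywords to obtain their cosine similarity when building the terminology database, but does not say how keywords are turned into vectors. The following is a minimal Python sketch under the assumption that each term is represented as a bag of character bigrams; the platform may well use learned embeddings instead. All function names (`char_bigrams`, `cosine_similarity`, `cosine_distance`, `closest_standard_term`) and the example terms are hypothetical and serve only to illustrate the relation distance = 1 - similarity.

```python
import math
from collections import Counter


def char_bigrams(term: str) -> Counter:
    """Assumed vectorization: count the character bigrams of a term."""
    text = term.lower().replace(" ", "")
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def cosine_distance(a: Counter, b: Counter) -> float:
    """Cosine distance is one minus cosine similarity."""
    return 1.0 - cosine_similarity(a, b)


def closest_standard_term(raw: str, standard_terms: list[str]) -> str:
    """Map a free-text keyword to the most similar standard term (hypothetical helper)."""
    raw_vec = char_bigrams(raw)
    return max(standard_terms,
               key=lambda t: cosine_similarity(raw_vec, char_bigrams(t)))


if __name__ == "__main__":
    # Illustrative vocabulary only, not taken from the platform.
    terms = ["liver depression", "spleen deficiency", "kidney yin deficiency"]
    print(closest_standard_term("liver-depression", terms))
```

In this sketch a doctor's free-text entry is matched to the standard term with the highest similarity (equivalently, the smallest distance); a real terminology database would probably also apply a minimum-similarity threshold before accepting a match.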