| Traditional Chinese Medicine (TCM) has a history of several thousand years. Data in this field has accumulated in massive amounts and continues to grow rapidly. Faced with this wealth of TCM knowledge, many TCM informatics researchers are working to develop medical language systems to support electronic medical records. In this paper, we further study the semantic relationships of the Clinical Nomenclature of Traditional Chinese Medicine and establish three new semantic relationships.

1 Background

Many medical language systems, serving different purposes, have been established in recent years to support the development of hospital information systems. Three of them are well known: the Systematized Nomenclature of Medicine (SNOMED), established by the College of American Pathologists; the Unified Medical Language System (UMLS), established by the National Library of Medicine; and the Traditional Chinese Medicine Language System (TCMLS), built by the Institute of Information on TCM of the China Academy of Chinese Medical Sciences.

SNOMED, which is based on the theoretical system of modern medicine, consists of a concepts table, a descriptions table and a relationships table. Its 41 semantic relationships are included in the "linkage concept" semantic type.

The purpose of UMLS is to facilitate the development of computer systems that behave as if they "understand" the meaning of the language of biomedicine and health. There are three UMLS Knowledge Sources: the Metathesaurus, the Semantic Network, and the SPECIALIST lexicon. UMLS has 54 semantic relationships.

TCMLS consists of a basic glossary and a semantic system, and is designed with reference to UMLS for classifying the concepts of TCM.
The semantic relationships of TCMLS are obtained by adding four new relationships to those of UMLS.

The Clinical Nomenclature of Traditional Chinese Medicine, which focuses on supporting electronic medical records, was designed by the China Academy of Chinese Medical Sciences. Its semantic relationships are included in the "linkage concept" semantic type, which is drawn from SNOMED, UMLS and the TCM Materia Medica Subject Headings.

The semantic relationships of both the Clinical Nomenclature of Traditional Chinese Medicine and TCMLS are derived from Western medical language systems, so they have some shortcomings in representing TCM relationships. Moreover, neither of the two TCM language systems has yet been put to use by end users, so their relationships still need to be refined in practice.

2 Research Content

The purpose of this paper is to study the semantic relationships of the Clinical Nomenclature of Traditional Chinese Medicine. The study has five steps: defining the aim, collecting medical records, extracting relationships, classifying concepts, and generating new relationships. The focus is on extracting relationships and generating new relationships.

Aim

We extract explicitly expressed relationships from 3,000 medical records, aiming to improve and complement the semantic relationships of the Clinical Nomenclature of Traditional Chinese Medicine and to find a feasible method for researching clinical semantic relationships.

Collecting medical records

In compliance with the relevant documents of the Ministry of Public Health, we selected nine subjects from which to extract medical records: internal medicine of TCM, surgery of TCM, gynecology of TCM, paediatrics of TCM, acupuncture and moxibustion, osteology and traumatology of TCM, dermatology of TCM, proctology of TCM, and ophthalmology and otorhinolaryngology of TCM.

The medical records come from original inpatient medical cases, periodicals in CNKI (China National Knowledge Infrastructure) and TCM monographs.
The selection principles are as follows. Records must date from after 1980; the records we collected that meet this criterion date from 1983 to 2006. No more than six sources are used for each subject, and each expert's records make up at most 25%. To ensure that the information is rich enough, each record consists of the chief complaint, history of illness, syndrome differentiation, diagnosis and treatment.

We collected 2,840 clinic medical cases and 160 inpatient medical cases from 36 monographs, original medical cases and periodicals in CNKI. For each subject 300 medical cases were collected, except internal medicine, for which 600 were collected.

Extracting relationships

Based on the nature of relationships and the characteristics of the Chinese language, we extract conjunctions, prepositions and verbs according to the following rules. We use only conjunctions that connect two sentences, because conjunctions of this kind are more significant in the medical field. We extract all prepositions that appear in the medical records. To limit subjective factors, every verb must have an object, and there must be at least two words in the verb or the object. Because one word may have different functions in different contexts, we classify the words according to their different meanings.

Classifying concepts

After the extraction, we build the complete glossary. We then form the concept systems, based on the similarity of the words, their meanings, and the characteristics of the semantic relationships in SNOMED and UMLS.

Generating new relationships

Of the semantic relationships we extracted, 36 are found in UMLS and 20 in SNOMED, giving coverage rates of 67.92% and 48.78% respectively. Before generating new relationship terms, we should give them definitions.
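As a rough check on the figures reported in this section, the following Python sketch recomputes the corpus totals and the two coverage rates. It assumes that a coverage rate is the number of a system's relationships found among the extracted ones divided by the total number of relationships in that system; the text does not state the formula, and the function name `coverage` is hypothetical. The SNOMED rate follows from 20 of 41. The reported UMLS rate of 67.92% corresponds to a denominator of 53 rather than the 54 given in the Background, so both are computed here.

```python
# Counts quoted in the text.
clinic_cases, inpatient_cases = 2840, 160
subjects, cases_per_subject, internal_medicine_cases = 9, 300, 600

# Two independent ways of totalling the corpus.
total_by_type = clinic_cases + inpatient_cases
total_by_subject = (subjects - 1) * cases_per_subject + internal_medicine_cases


def coverage(matched: int, total: int) -> float:
    """Assumed definition: matched relationships as a percentage of a
    system's total relationships, rounded to two decimal places."""
    return round(100 * matched / total, 2)


snomed_rate = coverage(20, 41)   # SNOMED: 20 of 41 relationships
umls_rate_54 = coverage(36, 54)  # UMLS with the 54 stated in the Background
umls_rate_53 = coverage(36, 53)  # UMLS with 53, an assumed denominator

print(total_by_type, total_by_subject)
print(snomed_rate, umls_rate_54, umls_rate_53)
```

Under these assumptions both totals come to 3,000 records, the SNOMED rate reproduces the reported 48.78%, and only the denominator of 53 reproduces the reported 67.92% for UMLS.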
In defining them, we follow the principle of relying on authoritative definitions as far as possible and modifying them only where it is reasonable to do so.

The basic requirements for confirming a new semantic relationship are veracity, applicability, generality, TCM features and practicality. Based on these five requirements, we establish three new clinical semantic relationships of TCM.

3 Discussion

Features of data distribution

The words we extracted show some clear characteristics in their distribution. Firstly, the written style of inpatient medical cases is more formal than that of clinic medical cases. Secondly, the content of inpatient cases, especially the syndrome differentiation part, is richer than that of clinic medical cases. Classical TCM relationships are concentrated in syndrome differentiation, which also clearly shows the different words used in different subjects.

Distribution in UMLS and SNOMED

We find that the relationships in SNOMED and TCMLS contain all the Western medical relationships we extracted, while the three new relationships we established all come from TCM. This shows that the semantic relationships in SNOMED and UMLS represent the important relations in the biomedical domain well.

Result analyses

The two sets of relationships cannot completely cover the relationships we extracted, for five reasons. Firstly, TCM has a different theory from Western medicine. Although TCMLS has added four new TCM relationships, it still has shortcomings in representing all TCM relationships. Secondly, language systems in the TCM field have never been put into practice, so new problems emerge from a practical point of view. Thirdly, we extracted only explicitly expressed relationships in this paper, so we cannot conclude that the other relationships in SNOMED and UMLS do not exist in TCM clinical practice.
Fourthly, the relationships obtained in this research are very detailed, which makes it difficult to extract valuable information from the large volume of data. Finally, all the inpatient records come from the same hospital, which affects the objectivity of the data as a whole.

A method of studying TCM relationships

Following the design ideas of research on national standardization, we take into account the needs of the Clinical Nomenclature of Traditional Chinese Medicine. The detailed procedure is as follows. First, analyze the requirements of the research and devise a reasonable way of generating new relationships. After confirming the scope of case collection, decide on a feasible method of extraction and recording. Then build a glossary and classify the concepts in it. Finally, analyze the existing relationships, define the new relationships and generate them.

4 Significance

At present, published books and documents all concern the standardization of concept entities, and none address semantic relationships. This paper has identified three new relationships, which enrich the semantic system of TCM. To some degree, this paper has put into practice a TCM language system that previously rested only on theory, and has at the same time explored a new way of generating new relationships in the TCM clinical field.

In the era of information explosion, more attention should be paid to research on hospital information systems, and we hope that this paper helps to enrich the semantic relationships and promote the development of the TCM language system. |