Construction And Evaluation Of Semantic Relations Among MeSH Of Organisms Category

Posted on: 2009-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhang
Full Text: PDF
GTID: 2178360242991281
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: The rapidly increasing volume of biomedical literature and the availability of web periodicals and full-text databases make text mining both necessary and feasible. Text mining is the computer-based discovery of new, previously unknown information by automatically extracting it from textual documents. It draws on information technology, text analysis, statistics, natural language processing, machine learning, and other fields. The aim of this study was to mine semantic relations among Medical Subject Headings (MeSH) of the Organisms category.

Materials and methods: Three literature levels were selected when searching MEDLINE: a microscopic level, a middle level, and a macroscopic level. Ten lowest-level medical subject headings were chosen at the microscopic level, all allowable qualifiers were chosen at the middle level, and the category heading "Organisms Category" alone was chosen at the macroscopic level. High-frequency major MeSH terms were then counted among the related articles for co-word cluster analysis, and MeSH pairs were extracted as rules to be described and evaluated. Articles containing each pair of MeSH terms were read, and the relationship between the two subject headings was judged. Semantic relations from the Unified Medical Language System (UMLS) were adapted to link the two MeSH terms that formed an association rule. To limit the number of rules, medical subject headings were replaced with their subclass numbers in the MeSH tree structure. All rules together formed the rule base. Sixty articles published in Zhonghua Yixue Za Zhi in 2005-2006 served as the test corpus, which was mined both manually by experts and automatically by MeSH_Manager. Sensitivity and precision were calculated.

Results: In total, 194 association rules were extracted: 37 from the microscopic level, 141 from the middle level, and 16 from the macroscopic level. Relations extracted more than three times by the experts were regarded as accepted rules, which yielded 40 rules from 29 articles. When the MeSH terms were substituted with their subclass codes, 24 subclass rules were obtained. After the test corpus was input into the MeSH_Manager system in XML format, 18 rules from 11 articles were returned. Comparing the 40 accepted rules with the rules returned by MeSH_Manager gave 8 entire matches, 17 partial matches, and one mismatch. The sensitivity for entire and partial matches was 20% and 35%, respectively, and the precision was 44% and 94%, respectively. When the subclass rules were compared, MeSH_Manager extracted 18 of the 24 manual subclass rules, giving a sensitivity of 75%.

Conclusions: Mining and evaluating relations among medical subject headings through co-word cluster analysis offers a new approach to text mining. First, extracting association rules through co-word cluster analysis and representing them with UMLS semantic relations is a feasible and credible method. Second, sensitivity and precision should improve as the rule base is extended and the UMLS semantic relations are better understood.
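The evaluation figures can be reproduced with a short calculation. The sketch below assumes the usual definitions, which the abstract does not spell out: sensitivity is the number of matched rules divided by the number of expert rules, and precision is the number of matched rules divided by the number of rules the system returned. The function and variable names are illustrative, not taken from MeSH_Manager.

```python
def sensitivity(matched: int, reference_total: int) -> float:
    """Fraction of the expert (reference) rules that the system recovered."""
    return matched / reference_total


def precision(matched: int, returned_total: int) -> float:
    """Fraction of the system-returned rules that match a reference rule."""
    return matched / returned_total


# Counts reported in the abstract.
accepted_rules = 40   # rules accepted by the experts
returned_rules = 18   # rules returned by MeSH_Manager
entire_matches = 8
partial_matches = 17

# Entire-match sensitivity and precision (assumed definitions, see above).
print(f"{sensitivity(entire_matches, accepted_rules):.0%}")   # 20%
print(f"{precision(entire_matches, returned_rules):.0%}")     # 44%

# Partial-match precision.
print(f"{precision(partial_matches, returned_rules):.0%}")    # 94%

# Subclass-level sensitivity: 18 of the 24 manual subclass rules recovered.
print(f"{sensitivity(18, 24):.0%}")                           # 75%
```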
Keywords/Search Tags: Text mining, rule base construction, knowledge representation, MeSH terms co-occurrence analysis, hierarchical cluster, rules evaluation
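The MeSH term co-occurrence counting and the subclass substitution described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the articles, the term names, the term-to-tree-number mapping, the frequency threshold, and the choice to truncate a tree number to its first segment are all hypothetical, and the hierarchical clustering step is not shown.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(articles):
    """Count how many articles contain each unordered pair of terms."""
    counts = Counter()
    for terms in articles:
        # sorted(set(...)) gives each pair one canonical order per article.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts


def to_subclass(tree_number, depth=1):
    """Truncate a dotted tree number to its first `depth` segments (assumed rule)."""
    return ".".join(tree_number.split(".")[:depth])


# Hypothetical toy corpus: major MeSH terms assigned to three articles.
articles = [
    ["term_a", "term_b", "term_c"],
    ["term_a", "term_b"],
    ["term_b", "term_c"],
]

# Hypothetical mapping from term to tree number (placeholders, not real assignments).
tree_numbers = {"term_a": "B01.050", "term_b": "B03.440", "term_c": "B04.280"}

counts = cooccurrence_counts(articles)

# Keep pairs that co-occur in at least two articles (illustrative threshold).
frequent_pairs = sorted(pair for pair, n in counts.items() if n >= 2)

# Replace each term with its subclass code to reduce the number of distinct rules.
subclass_pairs = [
    (to_subclass(tree_numbers[a]), to_subclass(tree_numbers[b]))
    for a, b in frequent_pairs
]

print(frequent_pairs)
print(subclass_pairs)
```

Collapsing terms to subclass codes means several term-level pairs can map to the same subclass pair, which is how the substitution limits the size of the rule base.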