Optimization Research On TCM Datasets Classification

Posted on: 2017-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
GTID: 2284330488970023
Subject: Chinese medicine

Abstract/Summary:
Traditional Chinese Medicine (TCM) is a precious heritage of the Chinese nation. It has contributed greatly to the nation's growth and prosperity and has had a far-reaching influence on the progress of world civilization. TCM informatization is an important part of both population health informatization and the development of TCM, and the Party and the state attach great importance to it. At the 60th anniversary of the founding of the China Academy of Chinese Medical Sciences, President Xi Jinping and Premier Li Keqiang personally called for continued promotion of TCM informatization, and for TCM, the precious heritage left to us by our ancestors, to be effectively inherited and developed.
With the rapid development of TCM, agencies and departments across the TCM sector have generated and accumulated massive scientific and business datasets. These datasets have not been effectively shared or used, which has restricted TCM scientific and technological innovation and socio-economic development. Against this background, and in response to the needs of social and technological development, China launched the TCM scientific data sharing and service project, which aims to provide users with data sharing services for the country's TCM scientific data resources through a unified platform.
As TCM informatization accelerates, the datasets of the TCM business domain are gradually being enriched. Some TCM hospitals have built information platforms based on electronic medical records. Informatization in TCM science, technology, and education is advancing, with foundational information databases and scientific data management and sharing service centers established for TCM science and technology. TCM colleges and universities have built TCM digital libraries and digital museums.
TCM business areas have accumulated a large number of datasets, which makes the distinctive needs of TCM information management more prominent.
Many experts and scholars have already done considerable research on classifying TCM datasets, but that work targeted scientific data; research on the TCM business domain has not yet been carried out. With the rapid development of the TCM industry, the datasets held by TCM agencies and systems now mostly arise from daily work processes, business activities, and clinical practice, and in specific business projects they are difficult to assign to the previous dataset classification code table. This study therefore analyzes the problems in the existing TCM dataset classification and their causes, and optimizes the classification on the basis of earlier research on TCM dataset classification and coding.
The purpose of this study is to provide a better and more comprehensive reference tool for TCM data resources, in support of data sharing. TCM administrative departments can use it for overall planning of TCM resources and for integrated, unified management and division of data resources. Developers of TCM information systems can use it as an important reference for data structure and database design. TCM data resource centers can use it to classify and manage TCM data resources. Users who need TCM data resources can use it to search, browse, and query the resources they need, which promotes efficient use of data resources and convenient information navigation.
The object of this study is TCM datasets. A dataset is a collection of data on a certain theme that can be identified and processed by a computer. The study classifies from two angles, business domain and subject. The business domain is the range of business information relating to TCM.
It includes all areas of TCM business workflows and activities, and business domain data are the working datasets of TCM business. A subject is an academic concept that links knowledge; it refers to a field or branch of science. Subjects form the classification system of scientific knowledge, and different disciplines hold different scientific knowledge. Business domain and subject are inherently unified and complement each other.
Through analysis of the problems in the existing TCM dataset classification and their causes, we found that in the original table some categories have no corresponding datasets, while others contain too many or too few. The classification code table is relatively unbalanced. International exchange and cooperation in TCM is not prominent in the table, and the question of whether national medicine should be integrated into traditional Chinese medicine is unresolved. After analyzing the causes, we adjusted the original table and formed the optimized TCM dataset classification system.
In this study, the two parallel main lines are the analysis of the business management domain and the test validation against literature datasets. The study mixes faceted classification with line classification and combines business domain with subject as the basic framework, under which the classification of TCM data resources is optimized. Faceted classification treats the data attributes or characteristics found in TCM business activities as a number of "faces"; within each "face", line classification is applied according to the disciplinary system or the unique attributes of the objects. Objects are classified into categories at several levels, arranged in a hierarchical, drill-down classification system.
In this classification system, coordinate classes stand in a parallel relationship, upper classes and their subclasses stand in a subordinate relationship, and coordinate classes neither repeat nor overlap.
To carry out the research, we retrieved and collected existing TCM databases through literature research and expert consultation. We retrieved 3,770 documents about TCM datasets from Wanfang Med Online, 13,939 from China National Knowledge Infrastructure, and 32,400 from Baidu Wenku. Duplicate documents occur on each database site. Under the inclusion criteria, the target documents were those matching search terms such as "traditional Chinese medicine", "database", "data platform", "information system", "platform", "data collection", and "data warehousing". Document types include journal articles, monographs, conference records and summaries, reviews, and so on. The exclusion criteria cover literature irrelevant to the purpose of the study, unpublished articles, and databases outside the scope of the TCM industry. By collecting and analyzing the TCM databases covered in these documents, we manually screened and summarized those that met the inclusion criteria and selected 586 distinctive TCM databases. The retrieved databases were then trial-classified under the existing dataset classification categories. This test against the literature datasets revealed four problem cases in the original classification: categories with no corresponding dataset, categories with too many datasets, categories with too few, and datasets classified repeatedly.
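
The trial classification check can be illustrated with a minimal Python sketch. It is not the procedure used in the thesis: the function name `audit_trial_classification`, the thresholds `min_count` and `max_count`, and the dataset labels are all hypothetical, chosen only to show how the four problem cases could be flagged from a list of (dataset, category) assignments.

```python
from collections import Counter, defaultdict

def audit_trial_classification(categories, assignments, min_count=2, max_count=50):
    """Flag the four problem cases in a trial classification.

    categories  -- category names in the code table
    assignments -- (dataset_name, category) pairs from the trial classification
    min_count, max_count -- hypothetical thresholds for "too few" / "too many"
    """
    pairs = set(assignments)  # ignore exact duplicate pairs
    counts = Counter(category for _, category in pairs)
    placed = defaultdict(set)
    for name, category in pairs:
        placed[name].add(category)
    return {
        # categories with no corresponding dataset
        "empty": [c for c in categories if counts[c] == 0],
        # categories holding too few datasets
        "too_few": [c for c in categories if 0 < counts[c] < min_count],
        # categories holding too many datasets
        "too_many": [c for c in categories if counts[c] > max_count],
        # datasets that were classified under more than one category
        "repeated": sorted(n for n, cats in placed.items() if len(cats) > 1),
    }

# Toy data: dataset labels are placeholders, not databases from the study.
categories = ["Chinese medicine", "Herbology", "Acupuncture", "Ancient books"]
assignments = [
    ("dataset-1", "Chinese medicine"),
    ("dataset-2", "Chinese medicine"),
    ("dataset-3", "Chinese medicine"),
    ("dataset-1", "Herbology"),
    ("dataset-4", "Acupuncture"),
    ("dataset-5", "Acupuncture"),
]
report = audit_trial_classification(categories, assignments, min_count=2, max_count=2)
print(report)
```

With these toy thresholds, a category is reported as unbalanced as soon as it holds fewer than two or more than two datasets; a real audit would need thresholds agreed with domain experts.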
In addition, the original table was completed in 2011. With the rapid development of TCM informatization since then, business activities such as "traditional medical management", "Chinese medicine health care management", and "traditional Chinese medicine exchange and cooperation" have grown quickly and produced a large number of datasets; they occupy an increasingly important position in the TCM industry and provide strong support for TCM informatization. Analysis of the categories in the original table thus showed that the table has limitations and that the classification needs further optimization.
The optimized TCM dataset classification code table follows the basic coding principles of being practical, scientific, systematic, and so on. Using a classification method based on subject and business domain, TCM data resources are divided by business domain into five categories: TCM industry management, Chinese medicine, herbology, acupuncture, and ancient books. Following "Six in One", TCM industry management is divided into eight categories: Chinese medicine administration, Chinese medicine management, health care management, education management, research management, TCM industry, development of TCM, and traditional Chinese medicine international exchange and cooperation. The other categories are adjusted according to business domain and subject.
Taking business domain and subject as the classification basis, we verified and analyzed the table in terms of method, level, and content description. Inclusion testing and application testing were used to verify that the code table is practical, reasonable, and workable.
After discussion with experts, we summarize the features of the optimized code table: it combines business domain with subject, and it mixes faceted classification with line classification.
Compared with the categories of the original table, the optimized classification groups TCM datasets into five categories: "TCM industry management", "Chinese medicine", "herbology", "acupuncture", and "ancient books". The parts of the optimized table that changed most are "TCM industry" and "national medicine".
We changed "TCM industry" in the original table to "TCM industry management" and adjusted its second-level categories to eight. "Chinese medicine management" was changed to "Chinese medicine administration". "Chinese medicine institutions and persons" was incorporated into "TCM administration". "Instruments and equipment" was moved to "TCM industry", which belongs to "Chinese business". We also added the "TCM exchange and cooperation" category and subdivided it. The optimized classification removes "national medicine": "national treatment" is placed under the "Chinese medicine" category and "national herb" under the "Chinese herbology" category, so that in this classification TCM refers to TCM as a whole, including national medicine. By business domain, the six categories of "Chinese medicine" are reduced to two, "TCM basic theory" and "Chinese medicine clinical diagnosis and treatment", and the nine second-level categories of "Chinese herbology" are reduced to two, "Chinese herbal medicines" and "pharmacy". "Acupuncture" and "ancient books" are unchanged.
The optimized dataset classification combines business domain with subject and takes into account TCM business activities, daily work, clinical practice, themes, and so on. TCM industry management, based on "Six in One", is divided into eight business domains.
At the same time, drawing on the Chinese Library Classification, the State Council degree office's subject classification code directory, and other classifications, the optimized table also classifies TCM datasets by subject, combining the two approaches and mixing faceted classification with line classification. To cover TCM data resources as a whole, we used this compound classification method to subdivide categories, so that more data can be standardized, integrated, and shared.
The optimized TCM dataset classification code table should be continuously enriched and strengthened through future business practice. As standardization deepens and TCM information management improves, many new databases will emerge from cross-sectoral cooperation; the TCM dataset classification method is nevertheless relatively stable, and new categories can be added on this basis as needed.
This study optimizes the classification of TCM datasets on the basis of expert analysis of TCM dataset classification in the relevant fields. TCM data resources are divided into five business domains, and a coding scheme is set up to form the optimized TCM dataset classification code table. The table is convenient for both computer recognition and manual processing. It is a precondition for information exchange and sharing among TCM hospitals, colleges and universities, research institutions, and other Chinese enterprises. It provides a strong technical guarantee for TCM administrative departments in establishing a standard, unified data platform. It provides a data reference for developers of TCM information systems, supports the rational allocation of TCM medical resources, and strengthens macro-level regulation of the TCM industry. It is of great practical significance for promoting the development of the TCM industry.
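
As an illustration of how such a code table might be held in software, the following Python sketch builds hierarchical codes for the five top-level categories and their second-level categories named in this abstract. The abstract does not state the actual code format, so the two-digits-per-level scheme, the ordering of categories, and the names `TAXONOMY`, `build_code_table`, and `parent_code` are assumptions made for this example only.

```python
# Category names are taken from the abstract; their order and the numeric
# codes generated below are hypothetical.
TAXONOMY = {
    "TCM industry management": [
        "Chinese medicine administration",
        "Chinese medicine management",
        "Health care management",
        "Education management",
        "Research management",
        "TCM industry",
        "Development of TCM",
        "Traditional Chinese medicine international exchange and cooperation",
    ],
    "Chinese medicine": [
        "TCM basic theory",
        "Chinese medicine clinical diagnosis and treatment",
    ],
    "Herbology": ["Chinese herbal medicines", "Pharmacy"],
    "Acupuncture": [],
    "Ancient books": [],
}

def build_code_table(taxonomy):
    """Map hierarchical codes to category paths (assumed: two digits per level)."""
    table = {}
    for i, (top, subs) in enumerate(taxonomy.items(), start=1):
        # Coordinate classes under one parent must not repeat.
        if len(set(subs)) != len(subs):
            raise ValueError(f"duplicate subcategory under {top!r}")
        top_code = f"{i:02d}"
        table[top_code] = (top,)
        for j, sub in enumerate(subs, start=1):
            table[f"{top_code}{j:02d}"] = (top, sub)
    return table

def parent_code(code):
    """Return the code of the upper class, or None for a top-level code."""
    return code[:-2] or None

code_table = build_code_table(TAXONOMY)
for code, path in code_table.items():
    print(code, " / ".join(path))
```

Because each level occupies a fixed width, the upper class of any code is recovered by dropping its last two digits, and a new category can be appended under a parent without renumbering existing codes.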
Keywords: Business domain, Subject, TCM datasets, Classification