
Research On Knowledge Indexing And Concept Retrieval Based On XMARC Information Description

Posted on: 2005-12-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L C Wang
Full Text: PDF
GTID: 1118360122971092
Subject: Control theory and control engineering
Abstract/Summary:
MARC (Machine-Readable Cataloging) is a metadata format with considerable advantages in the description, storage, exchange, standardization, and precision of information. Because it is so widely used in libraries in China and abroad, MARC will continue to exist and develop after more than 30 years of development. However, it demands professional skill, is difficult to use for description, and has a complex structure, so it cannot meet the demand for organizing the massive information resources on today's networks. Research on China Archival MARC and its networked computer information processing system is a key problem that must be solved in the Chinese archival field, yet it remains a blank in China. The description and realization of MARC concentration information in these fields have not been studied in depth either in China or abroad. Developing integrated, shared cultural and knowledge assets from catalog information has become an urgent need.

Theme retrieval is based on subject indexing and is an inevitable trend in modern libraries, archives, and information centers. Studying and developing theme knowledge is one of the necessary directions for information retrieval software and Internet search engines. It will be used to realize concept retrieval by classification and the knowledge conversion of morphemes. The data extraction problem in automatic theme indexing is nearly impossible to solve fundamentally, yet this research has long attracted attention. Research on theme concept indexing and retrieval, in China and abroad, remains mainly at the level of theme words. 
It has become one of the major bottlenecks in building and using Chinese information databases.

The research in this paper centers on five aspects: (1) the description of an XMARC information theory system based on XML in a network environment; (2) the advanced design of field XMARC metadata; (3) the founding of the K-S-C special knowledge relation through the identification "Keyword + Subject + Category"; (4) the establishment of XMARC theme knowledge automatic indexing and its algorithms; (5) research on knowledge processing methods for concept retrieval and theme classification based on XMARC. The specific work is as follows:

1. An information concentration description theory based on XMARC is proposed and studied. Two XMARC DTD (Document Type Definition) types, one based on field content and one based on frame, are designed in detail. The core elements of concentration XMARC are defined, and their realization with XML Schema is researched.

2. The semantic relations of the K-S-C theme concept are put forward and established, and are used in the automatic indexing of XMARC text. Ambiguity in automatic indexing is reduced by preprocessing special stop-words, and automatic matching time is shortened by the shortest word pushing method. The Maximum Matching (MM) algorithm for automatic indexing is thereby improved for the specific application field.

3. A concept retrieval method based on unionization is put forward and studied. XMARC theme classification is researched and designed using the theme category index. XMARC theme retrieval is researched and designed using the theme hierarchy index and semantic morpheme knowledge. These methods raise the quality of field theme information retrieval.

The major work above has been tested and verified through experiments. The significance of this paper lies in advancing the development and research of Chinese libraries and Chinese archives through theoretical research on XMARC. 
It will have a marked effect on the research and practice of XMARC concentration information. A new solution scheme for theme classification and subject indexing of Chinese information has been explored by automatically mining XMARC theme information. The practicality of the network information...
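
As a rough illustration of the automatic indexing step mentioned in the abstract, the following is a minimal sketch of textbook forward Maximum Matching (MM) with stop-word preprocessing. It is not the dissertation's improved algorithm: the "shortest word pushing method" is not described in the abstract and is not implemented here. The function names, the tiny lexicon, and the choice to split the text at stop-words before matching are all assumptions made for illustration.

```python
import re


def split_on_stop_words(text, stop_words):
    """Cut the text at every stop-word (assumed preprocessing step)."""
    if not stop_words:
        return [text] if text else []
    # Longer stop-words first so that they win over their own prefixes.
    ordered = sorted(stop_words, key=len, reverse=True)
    pattern = "|".join(re.escape(w) for w in ordered)
    return [frag for frag in re.split(pattern, text) if frag]


def forward_max_match(fragment, lexicon, max_len):
    """Greedy left-to-right segmentation, longest lexicon entry first."""
    tokens = []
    i = 0
    while i < len(fragment):
        # Shrink the window until it matches; a single character is the fallback.
        for size in range(min(max_len, len(fragment) - i), 0, -1):
            candidate = fragment[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


def index_terms(text, lexicon, stop_words=()):
    """Return the lexicon terms found in the text, in order of appearance."""
    max_len = max((len(w) for w in lexicon), default=1)
    terms = []
    for fragment in split_on_stop_words(text, set(stop_words)):
        for token in forward_max_match(fragment, lexicon, max_len):
            if token in lexicon:  # drop unmatched single characters
                terms.append(token)
    return terms


# Hypothetical lexicon: information, retrieval, information retrieval, theme, indexing
LEXICON = {"信息", "检索", "信息检索", "主题", "标引"}
print(index_terms("信息检索的主题标引", LEXICON, stop_words={"的"}))
```

Splitting at stop-words first shortens the fragments the matcher has to scan and prevents a match from spanning a stop-word, which is one plausible reading of how preprocessing reduces ambiguity. A production indexer would also need to handle stop-words that occur inside legitimate lexicon entries.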
Keywords/Search Tags: Thematic Concept, MARC Metadata, Information Retrieval, Knowledge Indexing, Chinese Language Processing