Research And Implementation Of Tibetan Digital Library

Posted on: 2006-07-08  Degree: Master  Type: Thesis
Country: China  Candidate: T Liu
GTID: 2168360155962058  Subject: Software engineering
Abstract/Summary:
With the rapid development of computer, communication, and network technology, the construction and use of the international information superhighway provide the environment and conditions for large-scale information systems, including libraries. Network information management, digitization techniques, and the construction of digital information resources have become a focus of international competition, and many countries are devoting tremendous resources to their research and development. As a result, the digital library (DL) has emerged as a new concept and scheme, regarded as one of the major IT developments of the 21st century. Since the 1990s, several DL-related techniques have matured, such as optical character recognition (OCR), full-text retrieval, the international coded character set (Unicode), and the Dublin Core metadata standard, providing strong support for the rapid development of digital libraries.

At present, implementations of Tibetan digital libraries are based on code pages that map the Tibetan encoding area onto the Chinese character area. This makes it hard to carry out parallel Tibetan and Chinese searches, or to build a Tibetan-encoding search pilot system suited to the Windows OS. In our work, the Tibetan Digital Library Pilot System was established after solving many of these problems.

This thesis discusses in detail the key DL techniques: digitization of physical resources, metadata markup, and full-text retrieval. Within the framework of the international standard (ISO/IEC 10646, or Unicode), and considering the features and writing rules of the Tibetan and Mongolian scripts, it describes the architecture of a Tibetan digital library following the DL development flow. In particular, it addresses in depth the technical solution for handling Tibetan in two different encoding schemes under the same ISO framework, and introduces a method for searching Tibetan text in both encoding schemes by means of thesaurus retrieval.

The project, implemented with the preferred encoding mode, is briefly introduced. The international standards Unicode, Dublin Core, and XML are also introduced; the project behind the thesis is strictly based on them. The project is intended as a conceptual pilot system that could, hopefully, serve as a reference for digital libraries in other minority languages.
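The idea of searching text stored in two encoding schemes through a thesaurus can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not name the two schemes, so the sketch assumes one scheme spells a syllable with combining Tibetan characters and the other uses a precomposed character. The code point `\uf300`, the `THESAURUS` table, and the functions `expand_query` and `search` are all hypothetical and exist only for illustration.

```python
# Hypothetical thesaurus: each set groups spellings of the same term in the
# two assumed encoding schemes. "\uf300" is an invented Private Use Area code
# point standing in for a precomposed stack; it is not a real assignment.
THESAURUS = [
    {
        "\u0f66\u0f90\u0f51",  # combining form: SA + subjoined KA + DA
        "\uf300\u0f51",        # assumed precomposed stack + DA
    },
]


def expand_query(term):
    """Return the term together with all equivalent spellings in the thesaurus."""
    variants = {term}
    for group in THESAURUS:
        if term in group:
            variants |= group
    return variants


def search(documents, term):
    """Return ids of documents that contain the term in either encoding."""
    variants = expand_query(term)
    return [
        doc_id
        for doc_id, text in documents.items()
        if any(variant in text for variant in variants)
    ]
```

Expanding the query rather than converting the stored documents leaves the source texts untouched, at the cost of maintaining the equivalence table. Whether the thesis makes the same trade-off is not stated in the abstract.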
Keywords/Search Tags: Unicode, OCR, Dublin Core, XML, Digital Library, Tibetan Encoding, Full Text Retrieval