Study On Recognition Of Printed Tibetan Script

Posted on: 2013-08-12
Degree: Master
Type: Thesis
Country: China
Candidate: W Zhou
GTID: 2248330374467141
Subject: Computer software and theory
Abstract/Summary:
Tibetan has a long history and is widely spoken. Standard Tibetan is spoken in the Tibetan-inhabited areas of southwest China, including the Tibet Autonomous Region and Sichuan, Qinghai, Gansu and Yunnan Provinces, and some Mongolians also speak the language. Outside China, it is spoken by Tibetan communities across a wide area of eastern Central Asia bordering the Indian subcontinent. A large body of literature and ancient books is written in Tibetan, and these works play a very important role in preserving Tibetan cultural heritage. Digitizing Tibetan script and making it available to information systems is therefore meaningful work.

In Tibetan script, vowel signs are attached to consonant letters. Traditionally, the script has 30 consonant characters and four vowel characters. It has been suggested that research on Tibetan words should emphasize their geometric configuration and structural patterns, in order to establish a theory and methodology for decoding them.

This thesis presents an approach to recognizing printed Tibetan by analyzing the structure and geometric features of the script. First, the system uses a tool to generate Tibetan word images and obtains single-character images through de-noising, thinning and segmentation. Then, using structural information about the Tibetan word, such as length and baseline location, it divides a single word into its vowels and consonants. Second, feature extraction yields the endpoints, the Euler number and other useful characteristics, which are matched against a repository of basic characters to identify each one. Finally, following Latin transliteration rules, the system converts all recognized words into Latin form, completing identification and recognition.
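As a rough illustration of two of the structural features named above, endpoints and the Euler number, here is a minimal sketch in plain Python. It is not the thesis's implementation: the function names, the toy glyph, and the choice of 8-connectivity for strokes and 4-connectivity for background are all assumptions made for this example. It treats a thinned character as a grid of 0s and 1s, takes an endpoint to be a stroke pixel with exactly one stroke neighbour, and takes the Euler number to be the number of connected strokes minus the number of enclosed holes.

```python
from collections import deque

# Neighbour offsets: 8-connectivity for strokes, 4-connectivity for background.
N8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
N4 = [(-1, 0), (0, -1), (0, 1), (1, 0)]


def count_components(grid, value, neighbours):
    """Count connected regions of cells equal to `value` using flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in neighbours:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == value and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def euler_number(grid):
    """Euler number = connected strokes minus enclosed holes."""
    cols = len(grid[0])
    # Pad with background so the outer background forms a single region.
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in grid]
              + [[0] * (cols + 2)])
    strokes = count_components(padded, 1, N8)
    holes = count_components(padded, 0, N4) - 1  # drop the outer background
    return strokes - holes


def endpoints(grid):
    """Return (row, col) of stroke pixels that have exactly one stroke neighbour."""
    rows, cols = len(grid), len(grid[0])
    found = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue
            n = sum(1 for dy, dx in N8
                    if 0 <= r + dy < rows and 0 <= c + dx < cols
                    and grid[r + dy][c + dx] == 1)
            if n == 1:
                found.append((r, c))
    return found


# Toy glyph (hypothetical, not a real Tibetan letter): a loop with a tail.
LOOP_WITH_TAIL = [
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

print(euler_number(LOOP_WITH_TAIL))  # 0: one stroke, one hole
print(endpoints(LOOP_WITH_TAIL))     # [(4, 2)]: the tip of the tail
```

Features like these are cheap to compute and do not change when a glyph is shifted within the image, which is presumably why they suit matching against a small repository of basic characters. A real system would combine them with further features, since many distinct letters share the same endpoint count and Euler number.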
In our experiments, the accuracy of the proposed system reaches as high as 89%.

A printed Tibetan character recognition system uses a computer to simulate how the human brain recognizes Tibetan words, drawing on mature technologies such as image understanding, computer vision and software engineering. The system analyzes Tibetan words with reference to the information technology standard GB22323-2008, and may also guide the direction of further research on Tibetan words.
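The final step described in the abstract, converting recognized components into Latin form, could look roughly like the sketch below. The abstract does not name the romanization scheme, so this example assumes a Wylie-style mapping and covers only a handful of letters. The table names and the `transliterate_unit` function are hypothetical, and stacked letters, prefixes and suffixes are deliberately ignored.

```python
# Partial Wylie-style table (assumption: the thesis may use a different scheme).
# Keys are Unicode code points of Tibetan consonant letters.
CONSONANTS = {
    "\u0f40": "k",   # TIBETAN LETTER KA
    "\u0f41": "kh",  # TIBETAN LETTER KHA
    "\u0f42": "g",   # TIBETAN LETTER GA
    "\u0f44": "ng",  # TIBETAN LETTER NGA
    "\u0f56": "b",   # TIBETAN LETTER BA
    "\u0f58": "m",   # TIBETAN LETTER MA
    "\u0f63": "l",   # TIBETAN LETTER LA
    "\u0f66": "s",   # TIBETAN LETTER SA
}

# The four vowel signs; a consonant with no sign carries the inherent "a".
VOWEL_SIGNS = {
    "\u0f72": "i",   # TIBETAN VOWEL SIGN I
    "\u0f74": "u",   # TIBETAN VOWEL SIGN U
    "\u0f7a": "e",   # TIBETAN VOWEL SIGN E
    "\u0f7c": "o",   # TIBETAN VOWEL SIGN O
}


def transliterate_unit(consonant, vowel_sign=None):
    """Romanize one recognized consonant with an optional attached vowel sign."""
    base = CONSONANTS[consonant]
    vowel = VOWEL_SIGNS[vowel_sign] if vowel_sign else "a"
    return base + vowel


# Hypothetical recognizer output: (consonant, vowel sign or None) pairs.
recognized = [("\u0f58", "\u0f72"), ("\u0f40", None), ("\u0f63", "\u0f7c")]
print([transliterate_unit(c, v) for c, v in recognized])  # ['mi', 'ka', 'lo']
```

Keeping the consonant and vowel tables separate mirrors the abstract's split of each word into consonants and vowels: the recognizer only has to label each component, and the romanization becomes a table lookup.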
Keywords/Search Tags: Printed Tibetan recognition, Character recognition, Feature extraction, Image understanding