
The Study And Analysis Of Oracle Bone Inscriptions Based On Statistical Natural Language Processing

Posted on: 2011-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Huang
Full Text: PDF
GTID: 2178330332467263
Subject: Engineering
Abstract/Summary:
The oracle bone inscriptions from the Yin ruins are the earliest Chinese characters found so far. They are a historical record of the late Shang Dynasty and the earliest traceable source of Chinese language, culture, and history. Collecting, sorting, cataloguing, and researching oracle bone script has developed into a discipline of its own, oracle bone studies. Building a corpus of oracle bone inscriptions can undoubtedly support computer-assisted research and accelerate its pace.

Aimed at the textual research and interpretation of oracle bone inscriptions, this thesis applies corpus theory and natural language processing (NLP) techniques to tag parts of speech in the inscriptions on the basis of a simple, purpose-built corpus. It emphasizes the implementation of NLP-based part-of-speech tagging and attempts to establish a corpus of oracle bone inscriptions, with the goals of sharing knowledge and assisting researchers in textual research and interpretation.

The main work and key technologies of this thesis are as follows. First, the author proposes a workflow for computer-assisted textual research and interpretation of oracle bone script, uses corpus techniques to process and analyze the inscriptions and form a simple oracle bone corpus, and builds a simple word segmentation and part-of-speech tagging system that gives the corpus structured information. For the manual part, the author provides semi-automatic word segmentation and part-of-speech tagging of the inscriptions. For the automatic part, the author analyzes the inscriptions with statistics-based NLP techniques and achieves simple automatic word segmentation and part-of-speech tagging. Finally, the author describes in detail the module structure and physical structure of the system that performs the word segmentation and part-of-speech tagging, and tests the system. The thesis closes with suggestions on applying an intelligent corpus to oracle bone inscriptions; the design of information extraction algorithms based on the annotation information and the further development of the corpus system are left to future work.
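The abstract does not say which statistical model the thesis uses for part-of-speech tagging. As a rough illustration only, the sketch below shows one common statistical approach: a bigram hidden Markov model with add-one smoothing and Viterbi decoding. The function names, the tag set, and the toy training sentences (English glosses, not real inscription data) are all assumptions made for this example and are not taken from the thesis.

```python
import math
from collections import Counter, defaultdict

def train_hmm(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start marker
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    return trans, emit

def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability of `key` under `counter`."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, trans, emit):
    """Return the most probable tag sequence for `words`."""
    if not words:
        return []
    tags = list(emit)
    n_tags = len(tags)
    # Vocabulary size plus one slot for unseen words.
    n_words = len({w for c in emit.values() for w in c}) + 1
    # best[i][tag] = (log score of best path ending in tag, previous tag)
    best = [{
        t: (log_prob(trans["<s>"], t, n_tags)
            + log_prob(emit[t], words[0], n_words), None)
        for t in tags
    }]
    for word in words[1:]:
        row = {}
        for t in tags:
            score, prev = max(
                (best[-1][p][0] + log_prob(trans[p], t, n_tags), p)
                for p in tags
            )
            row[t] = (score + log_prob(emit[t], word, n_words), prev)
        best.append(row)
    # Follow back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for row in reversed(best[1:]):
        tag = row[tag][1]
        path.append(tag)
    return path[::-1]

# Hypothetical toy corpus: English glosses with hand-assigned tags.
toy_corpus = [
    [("king", "N"), ("hunts", "V")],
    [("king", "N"), ("asks", "V"), ("rain", "N")],
    [("rain", "N"), ("comes", "V")],
]
trans, emit = train_hmm(toy_corpus)
print(viterbi(["rain", "hunts"], trans, emit))
```

In a setting like the one described, the semi-automatically tagged corpus would play the role of `toy_corpus`, and the trained counts would then be used to tag unseen inscriptions automatically.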
Keywords/Search Tags: Oracle bone inscription, Corpus, Part-of-speech tagging, Automatic word segmentation, Natural language processing