| Oracle bone script (also rendered "carapace-bone-script"), the writing used in the late Shang Dynasty to record divinations on tortoise shells and animal bones, is the earliest systematic writing found in China to date and the earliest traceable source for the Chinese language, culture and history. Collecting, sorting, cataloguing and researching these inscriptions has developed into a discipline of its own, oracle bone studies, which rests on a systematic description of the characters and their glyph forms. As digital methods enter the field, computer-assisted research on oracle bone script is expected to accelerate. Building a corpus that stores the characters and glyph forms of oracle bone script is therefore significant for their examination and interpretation. Taking researchers' examination and interpretation of oracle bone characters and glyph forms as its use case, this thesis applies corpus techniques, support vector machine (SVM) theory and related methods. Starting from a simple corpus built for this purpose, it analyzes oracle bone script in terms of glyph structure, builds a structure-based corpus, and mines the structural data in that corpus. 
For instance, the radicals are classified with a support vector machine so that knowledge can be shared and researchers' examination and interpretation work can be assisted. The main work and key technologies of this thesis are as follows. First, with the examination and interpretation of oracle bone script as the goal, the author proposes a workflow for computer-assisted examination and interpretation, and uses corpus techniques to process and analyze the material and to build a simple corpus of oracle bone characters and glyph forms. Characters built from components are decomposed semi-automatically, yielding a numeric representation whose main features are the component, the component's orientation, and the component's layer, and which a computer can process easily and directly. Second, using the statistics-based support vector machine, the author analyzes the glyph forms of oracle bone script, designs and implements a simple system that analyzes and classifies glyph forms by similarity, and tests that system. The test results show that the method used in the system performs well and that the expected goals are achieved. |
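
The numeric representation and the SVM classification described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the abstract does not give the component inventory, the orientation codes, the layer depth, the class labels, the kernel, or the training procedure. The names `Part`, `encode`, `train_linear_svm` and `predict` are hypothetical, and the classifier is a plain linear SVM trained by subgradient descent on the regularised hinge loss, chosen only to keep the example free of external dependencies.

```python
from dataclasses import dataclass

# Hypothetical inventories for illustration; the thesis does not list its codes.
COMPONENTS = ["person", "water", "mouth"]
ORIENTATIONS = ["upright", "inverted"]
MAX_LAYER = 2


@dataclass(frozen=True)
class Part:
    component: str
    orientation: str
    layer: int  # 1 = top level of the decomposition, 2 = nested one level down


def encode(parts):
    """Count components, orientations and layers into one fixed-length vector."""
    vec = [0.0] * (len(COMPONENTS) + len(ORIENTATIONS) + MAX_LAYER)
    for p in parts:
        vec[COMPONENTS.index(p.component)] += 1.0
        vec[len(COMPONENTS) + ORIENTATIONS.index(p.orientation)] += 1.0
        vec[len(COMPONENTS) + len(ORIENTATIONS) + p.layer - 1] += 1.0
    return vec


def score(w, b, x):
    """Signed distance proxy: w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200):
    """Subgradient descent on L2-regularised hinge loss; labels must be +1 or -1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * score(w, b, xi) < 1.0
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink: L2 penalty step
            if violated:  # hinge loss is active, so push towards the margin
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def predict(w, b, x):
    return 1 if score(w, b, x) >= 0.0 else -1


# Toy training set: +1 = contains the "person" radical, -1 = contains "water".
glyphs = [
    [Part("person", "upright", 1)],
    [Part("water", "upright", 1)],
    [Part("person", "upright", 1), Part("mouth", "upright", 2)],
    [Part("water", "upright", 1), Part("mouth", "upright", 2)],
]
labels = [1, -1, 1, -1]

X = [encode(g) for g in glyphs]
w, b = train_linear_svm(X, labels)
predictions = [predict(w, b, x) for x in X]
```

A count-based vector is only one possible encoding. It discards the spatial arrangement of components, so a system like the one the abstract describes would probably need richer positional features and a multi-class scheme rather than this two-class toy.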