Primary Research On Archaic Chinese History Corpus Construction

Posted on:2012-02-19

Degree:Master

Type:Thesis

Country:China

Candidate:W R Song

Full Text:PDF

GTID:2268330425997275

Subject:Computer software and theory

Abstract/Summary:

How to use computers to analyze, synthesize, and translate natural language material is an important question in both theory and practice, and it has become more important in the Internet era. In NLP (Natural Language Processing), statistical models based on large amounts of real corpus data have developed rapidly and perform well. Consequently, corpus construction has become foundational work for NLP. In this thesis, we discuss the general process of constructing an archaic Chinese history corpus: choosing the source texts, selecting the character encoding, cleaning at the character level, cleaning at the sentence-segmentation level, etc. We present general algorithms for the steps that lead from web documents to a clean, sentence-segmented primary corpus. In addition, we discuss the design of the corpus query function and present the design and implementation of several key algorithms and data structures. On the basis of this work, we developed a set of applications for corpus construction and built a corpus of the Comprehensive Mirror (《资治通鉴》)...
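As a rough illustration of the web-document-to-sentence pipeline described above, the following is a minimal Python sketch. It is not the thesis's algorithm. The function names `clean_characters` and `split_sentences`, the choice of 。！？ as sentence terminators, and the decision to drop all whitespace are assumptions made here for illustration only.

```python
import html
import re
import unicodedata

# Assumption: markup can be removed with a simple tag pattern. Real web
# pages (scripts, comments, malformed tags) would need a proper HTML parser.
TAG_RE = re.compile(r"<[^>]+>")

# Assumption: a sentence ends at 。, ！ or ？, optionally followed by closing
# quotation marks or brackets. Text left over without a terminator is kept
# as a final fragment.
SENT_RE = re.compile(r"[^。！？]*[。！？][」』”’）]*|[^。！？]+$")


def clean_characters(raw_html: str) -> str:
    """Character-level cleaning: strip tags, decode entities, drop noise."""
    text = TAG_RE.sub("", raw_html)   # remove markup first
    text = html.unescape(text)        # then decode entities such as &nbsp;
    # Drop whitespace plus control (Cc) and format (Cf) characters.
    # Unassigned code points are kept so rare ideographs are not lost.
    return "".join(
        ch for ch in text
        if not ch.isspace() and unicodedata.category(ch) not in ("Cc", "Cf")
    )


def split_sentences(text: str) -> list[str]:
    """Sentence-level segmentation on sentence-final punctuation."""
    return [m.group(0) for m in SENT_RE.finditer(text)]
```

Calling `split_sentences(clean_characters(page))` on a punctuated page yields one string per sentence. Unpunctuated source text would need a different segmentation approach, which this sketch does not attempt.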

Keywords/Search Tags:

Corpus, Corpus construction, Archaic Chinese, History, Query

Related items

1	Research On The Construction Of Lao-Chinese Bilingual Corpus System
2	Construction Of Chinese Email Corpus
3	Design And Implementation Of Automatic Construction System Of English-chinese Parallel Corpus
4	Building And Evaluating Special Domain Comparable Corpus
5	The Construction And Research Of Chinese-uyghur Bilingual Comparable Corpus Automatic Acquisition System Based On Machine Translation
6	Chinese And Vietnamese Bilingual Corpus Construction Based On Python
7	Tibetan-Chinese Bilingual Parallel Corpus Construction Method And Key Technology Research
8	The Research And Construction Of A Chinese Semantic Corpus
9	Corpus Construction And Research For Hedges Detection In Chinese Wikipedia
10	Chinese Grammar Corpus System Design