
A Corpus-Based Quantitative Study of the Modern Chinese Quasi-Oral

Posted on: 2006-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Lv
Full Text: PDF
GTID: 2205360155466763
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
This dissertation takes the quasi-oral, a transitional form between the oral and the written, as its research object. Drawing on a large-scale text corpus, and contrasting the quasi-oral with the written as well as different types of quasi-oral with one another, it gives a detailed, dynamic description and analysis of how characters and words are used in the modern Chinese quasi-oral. On that foundation, we try to discover methods and rules for automatically identifying the style of different texts. The dissertation falls into five chapters.

Chapter One: Introduction. This chapter summarizes the concept, characteristics, research method and significance of the quasi-oral. Based on the contrast between the oral and the written, we define the research object of this dissertation, the "quasi-oral", as oral language that differs from the fully spontaneous oral language of daily life and shows more deliberate, man-made traits. The quasi-oral differs from both the written and the daily oral, and it has its own research significance. Existing research on oral language is mostly introspective and experience-based. The method of this dissertation, which combines a corpus-based approach with a rule-based approach, is set out: we try to find problems in the corpus and analyze them using data from the corpus.

Chapter Two: The Establishment of the Annotated Corpus. This chapter expounds the purpose, principles, text types and text distribution of the annotated corpus, and the procedure by which it was developed. After reviewing and drawing on existing oral corpora of modern Chinese, the dissertation puts forward the concept of a "corpus of quasi-oral" and describes how the annotated corpus was established and processed.

Chapter Three: Analysis of Character Frequency. Because a Chinese character corresponds to a syllable, the behavior of characters can reflect the behavior of syllables in oral language. The cumulative frequency of high-frequency characters is higher in the oral than in the written. Among the high-frequency characters that are monosyllabic words, pronouns account for a much larger proportion than other word classes. As for syllable structure, the syllable structure of high-frequency characters is, on the whole, simple. The frequency of the most frequent character, "de", falls steadily as the type of quasi-oral approaches the daily oral. The number of characters used in the quasi-oral is about 2000. The average number of lowest-frequency characters across the six types is 589, nearly 30% of the total.

Chapter Four: Analysis of Word Frequency. Proper nouns (names of people, places, brands, organizations, etc.), numerals and English characters show different usage and characteristics in different types of quasi-oral. There are few suffixes in the quasi-oral, and many cases of mistaken syncopation. The average coverage rate of the 20 most frequent words across the six types of quasi-oral reaches 27.71%. Most of the high-frequency words are monosyllabic. Low-frequency words, those appearing one to four times, account on average for as much as 68.39% of the total number of words. From the high coverage rate of high-frequency words and the large number of low-frequency word types, we conclude that word usage in the quasi-oral is simple.

Chapter Five: Analysis and Discussion of Quasi-Oral Characteristics. Reduplicated words, words with the suffix "z/", and the series of words built on "shuo" embody oral characteristics in form, content and function. The usage of the suffix "men" reflects the degree of orality of the different types of quasi-oral. The use of colloquial words is also one of the features of the quasi-oral, and such words should be identified on the basis of their use in the oral corpus.

Epilogue: This part sums up the research as a whole and points out its shortcomings. The author then puts forward some proposals for further research.
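The two word-frequency measures reported for Chapter Four, the coverage rate of the most frequent words and the share of low-frequency words, can be illustrated with a minimal sketch. This is not the dissertation's own procedure: the function names and the toy token list are hypothetical, the text is assumed to be already segmented into words, and the low-frequency share is assumed here to be measured over distinct word types, which the abstract does not state explicitly.

```python
from collections import Counter


def top_k_coverage(tokens, k=20):
    """Share of all tokens accounted for by the k most frequent word types."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in counts.most_common(k))
    return covered / total


def low_frequency_type_share(tokens, max_count=4):
    """Share of distinct word types that occur at most max_count times.

    Assumption: the share is taken over word types, not over tokens.
    """
    counts = Counter(tokens)
    if not counts:
        return 0.0
    low = sum(1 for count in counts.values() if count <= max_count)
    return low / len(counts)


# Toy, made-up token list standing in for a segmented quasi-oral text.
tokens = ["de"] * 6 + ["wo"] * 5 + ["shuo"] * 2 + ["men"]

coverage = top_k_coverage(tokens, k=2)               # "de" and "wo" cover 11 of 14 tokens
low_share = low_frequency_type_share(tokens, max_count=4)  # "shuo" and "men" are 2 of 4 types
```

Averaging `top_k_coverage(tokens, k=20)` over the six types of quasi-oral would give a figure comparable in kind to the 27.71% reported above, provided the segmentation and counting conventions match those of the dissertation.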
Keywords/Search Tags:Corpus, Quasi-oral, Character Frequency, Word Frequency