| Collocations and corpora have recently attracted wide attention among researchers, educators and even learners. For Firth, to whom the term is attributed, a collocation is a mode of meaning; for Halliday and Hasan, it is a cohesive device; and for McIntosh, it is a mode of style. For Sinclair and his COBUILD corpus project, a collocation is the embodiment of the dominant idiom principle in actual language use. Posing a great challenge to the traditional argument of around the 1950s about the creative and arbitrary nature of language, the work of these Neo-Firthians, especially their corpus-based research, has greatly enriched the study of collocations. In China, although vocabulary teaching has been given weight during the last two decades, there seems to be a lack of awareness of the importance of collocations. Both teachers and learners tend to place too much emphasis on learning words in isolation and memorizing lists of their meanings, which to some extent leads to learners' collocational errors in composition writing. Hence, this paper sets out to explore learners' use of six collocation patterns, namely, noun + noun [cc1], noun + verb [cc2], verb + noun [cc3], adjective + noun [cc4], linking verb + adjective [cc5], and adverb + adjective [cc6], with the help of 540 TEM-4 essays from 2004 and 2005, randomly collected from the up-to-date, large-capacity English Sub-Corpus of the Corpus of Chinese Foreign Language Learners. The main purpose of the present corpus-based collocation research is to answer the following three research questions: (1) Is English majors' collocation competence correlated with their general writing ability in TEM-4? If so, is the correlation significant? (2) What are the reasons for English majors' use of improper collocations in TEM-4 writing? 
(3) Given what is known about learners' developmental patterns in applying collocations, what kinds of improvement are needed in vocabulary teaching to enhance English majors' collocation competence? In line with these three tasks, the paper is divided into six chapters. The first part gives a brief introduction to the research questions, the origin of view, the need for the present study, and the structure of the dissertation. The second part clarifies some concepts relevant to collocations and gives an overview of the literature on collocations. Among these studies, the internal lexicon hypothesis is given priority, since it lays a theoretical foundation for the later analysis of collocational errors as well as for the corresponding pedagogical implications. Finally, on the basis of the literature review, a working definition and classification of collocation is given for this thesis: a collocation is a conventional syntagmatic association of a string of lexical items which co-occur in a grammatical construct with mutual expectancy greater than chance as a realization of meaning in text. The third part offers a theoretical framework of corpus-based research and data-driven learning, followed by a concise description of the present English major corpus, the TEM-4 and some native or learner corpora. In this chapter, emphasis is placed on presenting the data-driven learning approach, since it is considered to strike a balance between the product approach and the process approach and also forms a basis for the later analysis of pedagogical implications. The following two parts deal with the research methodology, the results and the discussion. Following the restatement of the research questions and the presentation of the research design in Chapter Four, Chapter Five, on data analysis and discussion, is of paramount significance. 
With the intention of answering the first and second research questions, the fifth chapter reaches the following conclusions: (1) The general data description reveals that the ratio of improper collocations, especially cc1, cc3 and cc4, is relatively high compared with other mistakes, while a comparison of the two years' data shows that the 2004 TEM-4 participants performed much better than the 2005 examinees in both general writing competence and collocational ability. (2) The correlation coefficient between incorrect collocation frequencies and essay scores is negative but not significant, whereas the correlation between collocation errors and other errors is quite strong. The statistics also show that, on average, one composition of about 200 words may include one to three collocational errors. (3) Four interwoven factors are found to explain the collocational mismatches, namely, confusion among words of similar pronunciation or within one word family; the influence of L1 thinking patterns which become illogical in English; improper semantic comprehension caused by L2 negative transfer, such as synonym confusion and semantic overgeneralization, or by L1 transfer, like literal translation and ignorance of rule restrictions. Part Six, conclusions and suggestions, aims not only to summarize the above findings and suggest directions for further study, but also to illustrate practical implications for English vocabulary teaching. First, a preliminary data-driven learning sample is presented to show what teaching should include in order to raise learners' collocational consciousness. Second, with the help of the internal lexicon hypothesis, advice is provided on how to use the material to make learners notice and efficiently acquire collocation knowledge. As repeatedly emphasized in this paper, collocations are of great pedagogical value in language acquisition. 
It is hoped that this research may to some extent contribute to the improvement of English language teaching and of learners' writing competence as well. |