| This paper is based on corpus linguistics and uses a statistical method to study word collocation in life science texts (gene news). The purpose of the study is to extract the collocates of the nodes "gene" and "genes" and to carry out a linguistic analysis of the different word forms and collocation rules. The significance of this study is twofold. First, it reveals rules of natural language generation and processing. Second, it provides materials and research methods for computer-assisted language study. This paper adopts the concurrent definition of collocation, under which a word combination can be regarded as a collocation as long as its T score and MI score reach a statistically significant level. The research follows four principles: 1) using natural materials; 2) combining quantitative analysis with qualitative analysis; 3) using a lexis-centered approach; 4) being phrase-oriented. The corpus consists of 110 gene-related news articles, totaling 77,973 words, selected from Nature over the last decade. Two software tools are used: Wconcord and Excel. Wconcord is used to extract collocates, and Excel is used to calculate the T score and MI score. The T score is used to identify proper collocations, and this paper sets the threshold for a proper collocate at above 2. The MI (mutual information) score is used to measure collocational strength, and this paper sets its threshold at above 11. The results show that: 1) collocation study based on computer processing can reveal the rules of mutual restriction and mutual attraction in natural language processing; 2) the study can help readers build collocation frames and provides materials and methods for computational lexicography; 3) the research method can be applied to collocation study in small corpora; 4) the use of statistical methods in collocation study has some limitations; 5) many collocations are hard to understand, but there are still some collocations with which people are familiar. 
Analyzing the gradation of collocates in corpus research can help readers and learners build a re-categorization process and understand unfamiliar collocations. |
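
The abstract does not state the formulas behind the T score and MI score, so the following is only a minimal sketch of one common corpus-linguistics formulation, not necessarily the calculation the author performed in Excel. It assumes the expected co-occurrence frequency is E = f(node) × f(collocate) × span / N, with T = (O − E) / √O and MI = log2(O / E). The function names, the `span` parameter, and the counts in the usage example are hypothetical; only the corpus size (77,973 words) and the two thresholds (T above 2, MI above 11) come from the text.

```python
import math


def expected_frequency(f_node, f_collocate, corpus_size, span=1):
    """Expected co-occurrences if the two words were independent.

    Assumption: `span` is the total number of word positions inspected
    around each node occurrence (hypothetical parameter).
    """
    return f_node * f_collocate * span / corpus_size


def t_score(observed, expected):
    """T score: how confidently the co-occurrence exceeds chance."""
    return (observed - expected) / math.sqrt(observed)


def mi_score(observed, expected):
    """Mutual information score: strength of the association."""
    return math.log2(observed / expected)


def score_collocate(observed, f_node, f_collocate, corpus_size,
                    span=1, t_min=2.0, mi_min=11.0):
    """Return both scores and whether each exceeds its threshold.

    Defaults mirror the thresholds stated in the abstract.
    """
    expected = expected_frequency(f_node, f_collocate, corpus_size, span)
    t = t_score(observed, expected)
    mi = mi_score(observed, expected)
    return {"t": t, "mi": mi, "t_ok": t > t_min, "mi_ok": mi > mi_min}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only; not data from the study.
    result = score_collocate(observed=40, f_node=500, f_collocate=100,
                             corpus_size=77973, span=8)
    print(result)
```

Keeping the two scores separate reflects the roles the abstract assigns them: the T score filters for proper collocates, while the MI score grades how strongly the remaining pairs attract each other.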