A clear semantic gap exists between the sentence and the discourse when a computer analyzes natural language text. The sentence group is a grammatical unit that lies between the sentence and the discourse, and it effectively reduces the adverse effects of this gap. How to perform sentence grouping has therefore become a focus of study. At the same time, with the development of the Internet and the growing number of electronic text documents, there is a pressing need to automatically extract the information that people are interested in, and automatic abstracting has become a popular topic in natural language processing.

Based on these considerations, this dissertation accomplishes the following work.

Firstly, this dissertation describes the current state of research on sentence grouping and automatic abstracting, and summarizes in detail the basis of sentence grouping. It then reviews and analyzes sentence grouping methods based on hierarchical clustering and on the Hierarchical Network of Concepts (HNC). It also introduces the key technologies involved, including vector representation of text and text clustering.

Secondly, this dissertation proposes an automatic Chinese sentence grouping method based on multiple discriminant analysis (MDA). The proposed method addresses several problems in Chinese sentence grouping, including the scarcity of computational-linguistic data and the handling of connective markers in a discourse, as well as the fact that the sentence group has rarely been treated as a grammatical unit. First, an annotated evaluation corpus is constructed according to the theory of Chinese sentence groups. Second, a set of evaluation functions J is designed based on the multiple discriminant analysis method. Discourse markers are also taken into account to improve performance.
The results show that the proposed method achieves better grouping performance than the original MDA method.

Thirdly, automatic abstracting that takes the sentence as its basic unit often suffers from poor fluency or information redundancy. This dissertation introduces a new approach to automatic abstracting that builds on the automatic Chinese sentence grouping method. The idea is that a discourse is composed of different topics, and each topic should be described by a sentence group rather than by discrete sentences, because a sentence group is relatively independent in meaning, syntactically complete, and logically cohesive. The results show that this approach is more suitable than traditional methods that use the sentence or the passage as the basic unit.