Research On Chinese Sentences Grouping Method And Its Application

Posted on: 2017-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Zhang
Full Text: PDF
GTID: 2348330482986984
Subject: Computer application technology
Abstract/Summary:
The sentence is the largest grammatical unit in traditional grammar. The content of a text cannot be understood by analyzing the meaning of an isolated word or sentence, because there is an obvious semantic gap between the sentence and the discourse. As an important transition between the two, sentence grouping has become a hot research topic in recent years, and sentence group identification is an important direction in computational linguistics. With the rapid growth of text on the internet, how to retrieve useful information from massive data quickly, easily and accurately is also a hot research topic, and it is one of the main research directions in automatic summarization. The main contents of this thesis are as follows.

First, this thesis surveys related work on sentence grouping and automatic summarization. It outlines the basic theory of sentence grouping, starting from the nature and characteristics of sentence groups, and analyzes and summarizes the ways in which sentences combine and the basis on which they are grouped. It also describes typical sentence grouping methods, including the method based on the hierarchical network of concepts, the hierarchical clustering method and the multiple discriminant analysis method.

Second, Chinese sentence grouping has received little attention in discourse analysis theory, and existing approaches are either limited to certain linguistic rules or do not consider the discourse markers that appear in sentences. To address this, the thesis proposes a K-means-GA based method for Chinese sentence grouping. The method uses an LDA topic model to obtain a feature-vector representation of each sentence, and applies the cosine similarity measure and the maximum continuous subsequence to compute the internal similarity of a sentence group. Discourse markers are used as reward and punishment factors to correct unreasonable assignments of sentences to groups. The experimental results show that the proposed method outperforms the original K-means-GA method in sentence group identification.

Finally, the proposed sentence grouping method is applied to automatic summarization. Most current automatic abstracting methods take a single sentence or a paragraph as the processing unit, which makes the resulting abstract incoherent and redundant. A consistency analysis of the experimental results indicates that abstracts of better quality can be obtained when sentence groups are used as the processing unit.
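The idea of combining topic-vector similarity with discourse markers can be illustrated with a small sketch. It is not the thesis's K-means-GA method, and it does not implement the maximum continuous subsequence step: it replaces the clustering with a simple greedy pass that starts a new group whenever the adjusted similarity between adjacent sentences falls below a threshold. The topic vectors are assumed to come from an already trained LDA model, and the marker lists, the `bonus` value, the `threshold` value and all function names are hypothetical choices made only for illustration.

```python
import math

# Hypothetical marker lists, chosen only for illustration.
# Markers that usually continue the previous thought
# ("therefore", "so", "moreover"):
CONTINUE_MARKERS = ("因此", "所以", "而且")
# Markers that usually open a new thought ("on the other hand", "besides"):
SHIFT_MARKERS = ("另一方面", "此外")


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def marker_adjustment(sentence, bonus=0.2):
    """Reward a leading continuation marker, penalize a leading shift marker."""
    if sentence.startswith(CONTINUE_MARKERS):
        return bonus
    if sentence.startswith(SHIFT_MARKERS):
        return -bonus
    return 0.0


def group_sentences(sentences, topic_vectors, threshold=0.8):
    """Greedy grouping of adjacent sentences; returns lists of sentence indices.

    topic_vectors[i] is assumed to be the LDA topic distribution of
    sentences[i], computed elsewhere.
    """
    if not sentences:
        return []
    groups = [[0]]
    for i in range(1, len(sentences)):
        score = cosine(topic_vectors[i - 1], topic_vectors[i])
        score += marker_adjustment(sentences[i])
        if score >= threshold:
            groups[-1].append(i)   # similar enough: stay in the current group
        else:
            groups.append([i])     # otherwise start a new group
    return groups


if __name__ == "__main__":
    # Toy sentences with hand-made two-topic vectors (not real LDA output).
    sentences = [
        "今天下雨了。",
        "因此比赛被取消。",
        "另一方面，股市上涨。",
        "投资者很高兴。",
    ]
    vectors = [[0.8, 0.2], [0.4, 0.6], [0.5, 0.5], [0.4, 0.6]]
    print(group_sentences(sentences, vectors))
```

In this toy input the second sentence is kept with the first only because of the reward for its continuation marker, and the third sentence starts a new group only because of the penalty for its shift marker, even though its topic vector is close to that of the second sentence.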
Keywords/Search Tags: sentences grouping, automatic abstracting, K-means-GA, topic model, discourse analysis