Extracting Language Rhythm And Applying It In Text Analysis

Posted on: 2012-03-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Chen
Full Text: PDF
GTID: 1268330392469711
Subject: Computer application technology
Abstract/Summary:
Language rhythm is an important feature of language. It occurs widely and is applied in fields such as speech recognition and literary aesthetics. Because it is a complex, composite concept, no unified definition of it is widely accepted. Each research field takes its own view of it, which makes language rhythm difficult to study quantitatively. Drawing on the results of earlier research, this dissertation addresses how to define and model language rhythm. Features extracted from language rhythm also perform well in analysis, which shows that they are suitable for text analysis. The main contents of this dissertation are as follows:

(1) Defining language rhythm hierarchically, based on a study of the essence and nature of language. Language is very complex: it carries natural, grammatical, logical, emotional and other connotations. It nevertheless follows a rule, rhythm, which is one of its outstanding characteristics. Language rhythm is therefore divided into four kinds: natural rhythm, grammar rhythm, logic rhythm and emotion rhythm. Each kind is defined in this dissertation and its properties are discussed in full.

(2) Analyzing and designing efficient methods for extracting language rhythm. Studying the intention of language reveals the rhythm in it. Methods for extracting natural rhythm, grammar rhythm, logic rhythm and emotion rhythm are described, each taking the characteristics of that kind of rhythm into account. The language rhythm unit, the rhythm array and related notions are also defined.

(3) Extracting the features of language rhythm. Two methods are proposed. The first builds the state transition matrix of language rhythm. Language is presented sequentially, so it can be described as moving from one state to the next, and the state transition matrix can therefore represent language rhythm. The second builds the language rhythm network. Each rhythm unit is adjacent to another, which gives a relation between neighbors. The nodes of the network are rhythm units, and an edge joins every two neighboring units.

(4) Applying language rhythm features to text analysis. Experiments are designed to verify the approach. Tasks such as text classification, author identification, style identification and topic discovery are completed successfully using language rhythm features. A Bayesian classifier and k-means are used to analyze the features. The results show that language rhythm features are suitable for text analysis.

(5) Analyzing the features of the language rhythm network and discussing some properties of language. The analysis shows that the language rhythm network is a complex network with a small average shortest path length, a high clustering coefficient and a scale-free structure. A study of the language rhythm of masterpieces finds that their networks have the salient features of a "small-world" network, and that the product of average shortest path length and clustering coefficient is markedly high. The same test on "phenomenon" works gave the same result. It is concluded that the ability to control language can be analyzed through the language rhythm network.
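The two feature-extraction ideas in item (3) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the abstract does not say how rhythm units are encoded, so the symbols "S", "M" and "L" (standing for short, medium and long units) and the function names `transition_matrix`, `rhythm_network` and `average_clustering` are assumptions made only for illustration. The sketch counts first-order transitions between adjacent units, links adjacent units in an undirected graph, and computes the average clustering coefficient mentioned in item (5).

```python
from collections import Counter, defaultdict


def transition_matrix(units):
    """Row-normalised first-order transition probabilities between adjacent units."""
    counts = defaultdict(Counter)
    for current, following in zip(units, units[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


def rhythm_network(units):
    """Undirected graph: nodes are rhythm units, edges join units seen side by side."""
    graph = defaultdict(set)
    for a, b in zip(units, units[1:]):
        if a != b:  # ignore self-loops in this sketch
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def average_clustering(graph):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbours count as 0."""
    if not graph:
        return 0.0
    total = 0.0
    for neighbours in graph.values():
        k = len(neighbours)
        if k < 2:
            continue
        # Count edges that exist among this node's neighbours.
        links = sum(1 for u in neighbours for v in neighbours if u < v and v in graph[u])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)


# Hypothetical rhythm-unit sequence (assumed encoding, not taken from the dissertation).
units = ["S", "L", "S", "M", "L", "S"]
matrix = transition_matrix(units)
network = rhythm_network(units)
clustering = average_clustering(network)
```

For this toy sequence, row "S" of `matrix` splits evenly between "L" and "M", and the three units form a triangle in `network`. A real study would replace the toy symbols with the rhythm units that the dissertation defines.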
Keywords/Search Tags: Language Rhythm, Markov Process, Complex Network, Text Analysis