Study Of Center Nodes In Co-occurrence Networks Of Six Different Languages

Posted on: 2015-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: P Li
GTID: 2250330431953492
Subject: Basic mathematics
Abstract/Summary:
Complex networks originated in graph theory. The graph representation of real-world networks dates back to the 18th century, when the famous mathematician Euler used a graphical method to study the Königsberg bridge problem. This study gave rise to graph theory. Over the following 200 years, graph theory went from initial stagnation to explosive growth in the last 100 years. In the 1960s in particular, the two Hungarian mathematicians Erdős and Rényi built the ER random graph [1], and the establishment of random graph theory opened the mathematical study of complex networks. But most real complex networks are neither simple nor random, so many scholars began to study real-world networks with complex structures and established various models to match them. The two most famous models ushered in a new age of complex networks and played an epoch-making role. One is the WS model built by Watts and Strogatz in 1998 [2], which has the 'small-world' feature exhibited by many real networks. Later, Newman and Watts improved the WS model and built the NW model [3], but these two small-world models are essentially similar. The other is the BA scale-free network model built by Barabási and Albert in 1999 [4]. It has been found that the growth in the number of nodes and the existence of preferential attachment result in a power-law degree distribution of the network.

Finding important nodes in a network has long been an important subject in graph theory. With the vigorous development of network science, how to find center nodes in a network has become a fundamental and important problem in the study of complex networks [5]. Research on the importance of nodes in complex networks originated in the analysis of social networks [6]. Freeman and other scholars have done a great deal of research on social networks.
After that, systems science research, information research, literature retrieval research, and other fields independently posed similar problems. In 2007, Henan and his coauthors summarized several methods for discovering important nodes in a network, including degree, closeness, betweenness, and the node-deletion method [5]. In 2013, Liu and his coauthor, again from a different angle, introduced several indicators for discovering important nodes based on network structure (degree, betweenness, closeness, eigenvector, and k-core decomposition) and summarized the advantages and disadvantages of the various ranking methods and the settings to which each applies; for example, the closeness method is not suitable for random networks, and the k-core decomposition method is not suitable for tree networks [7].

Language is a quintessence of human civilization and can be viewed as a complex adaptive system formed by long-term evolution [8]. It is not only a kind of network but also a kind of complex network. In 2001, based on the British National Corpus, Cancho and Solé constructed two word co-occurrence networks [9]. As far as we know, this was the first time that complex network methods were used to study human language. Since then, researchers have studied networks of several different languages by different methods. According to how an edge is defined, the main network-generating methods in use are co-occurrence [9-13], syntax [14,15], semantics [16,17], and concept [18]. These studies found that most language networks are small-world and scale-free.

Each method of constructing language networks has its strengths, but researchers have focused on the overall characteristics of the network, and the local characteristics of language networks are buried in the overall characteristics [19]. The study of local characteristics of language networks should start from the nodes, because different nodes differ in their role and importance in the network.
The centrality index of a node reflects the node's role in the network and its influence on other nodes. In 2011, Chen and Liu [19] studied the center nodes of a Chinese syntactic network by investigating characteristic network parameters (such as degree, in-degree, out-degree, closeness, in-closeness, out-closeness, and betweenness) and by removing nodes. They studied center words from the perspective of complex networks and revealed the importance of the center words through quantitative analysis. So far, we have not found other papers on the center nodes of language networks.

Gao and his coauthors [20] chose 100 reports from the documents of the United Nations and built parallel texts (i.e., a collection of texts with the same semantic content in different languages) in six languages to construct word co-occurrence networks of the six languages. Building on their work, and from the perspective of complex networks, we apply the analysis methods of social networks (degree, betweenness, closeness, eigenvector, and k-core decomposition) and the analysis method of systems science (the node-deletion method) to study and rank the importance of the center nodes of the six language networks and to compare their commonalities and differences. Applying the two kinds of methods helps us better find the most essential characteristics of the human language system.

This thesis is divided into eight chapters, and the main contents are as follows.

In Chapter 1, we introduce the selection and segmentation of the texts, the method of generating the co-occurrence networks of the six languages, and basic concepts of complex networks.

In Chapter 2, we study the center nodes of the word co-occurrence network of Chinese. The network investigated in this chapter is directed and weighted.
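A directed, weighted word co-occurrence network can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each word is linked to the word immediately following it in the same sentence and that the edge weight counts how often that ordered pair occurs; the thesis may define co-occurrence differently. The function name and the toy corpus are invented for illustration and are not data from the thesis.

```python
from collections import defaultdict


def build_cooccurrence_network(sentences):
    """Return {source: {target: weight}} for adjacent word pairs.

    Assumption: sentences are already segmented into tokens, and only
    immediately adjacent words within a sentence are linked.
    """
    edges = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        # Pair each token with the token that follows it.
        for source, target in zip(sentence, sentence[1:]):
            edges[source][target] += 1  # weight = number of co-occurrences
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {source: dict(targets) for source, targets in edges.items()}


# Hypothetical toy corpus (not taken from the thesis).
corpus = [
    ["the", "council", "adopted", "the", "report"],
    ["the", "report", "of", "the", "council"],
]
network = build_cooccurrence_network(corpus)
print(network["the"])
```

In this toy corpus the node "the" gains two outgoing edges, each with weight 2, because both "the council" and "the report" occur twice.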
By analyzing and comparing degrees and word frequencies, we find that the ranks by degree and by word frequency of the first five nodes are the same, and we choose the first four nodes as center nodes. Using the centrality analysis of social networks, we compute several network centrality parameters and find that the four center nodes are both local and overall center nodes, but of different importance in the network. Further, we find that the rank of importance of these nodes is the same as the ranks of their degrees and word frequencies, and so conclude that node "的" is the most central node of the network and that the overall centrality of node "在" is higher than that of node "了". This is consistent with the conclusion of Chen and Liu [19] in their study of syntactic network centrality. Next, using the node-deletion method, we study the changes in statistical parameter values and the damage to network connectivity to further determine the importance of these nodes. The conclusion is the same as that obtained by the centrality method.

In Chapter 3, using the method of Chapter 2, we study the center nodes of the directed and weighted word co-occurrence network of English. Comparing degrees and word frequencies, we select the three nodes with the highest degree values as the center nodes. Through centrality analysis, we compute network centrality parameters and find that the three center nodes are both local and overall center nodes but have different importance in the network. Their local centrality rank is C英 < A英 < B英. The degree of node B英 is not the largest, but its strength is, so it has the largest local influence. This also shows that weighted networks allow a more detailed and complete analysis of the strength of connections among nodes, which is an advantage of weighted networks.
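The difference between degree and strength can be seen in a small Python sketch. It assumes the usual definitions for a directed weighted network (degree counts incident edges, in plus out; strength sums their weights). The node names and weights are hypothetical and chosen only so that one node has the larger degree while the other has the larger strength.

```python
from collections import defaultdict


def degree_and_strength(edges):
    """edges: {source: {target: weight}}; returns {node: (degree, strength)}.

    Degree counts incident edges (in plus out); strength sums their weights.
    """
    degree = defaultdict(int)
    strength = defaultdict(int)
    for source, targets in edges.items():
        for target, weight in targets.items():
            # Each edge contributes to both of its endpoints.
            for node in (source, target):
                degree[node] += 1
                strength[node] += weight
    return {node: (degree[node], strength[node]) for node in degree}


# Hypothetical toy network: A has more neighbours, B has heavier edges.
toy_edges = {
    "A": {"x": 1, "y": 1, "z": 1},
    "B": {"x": 5, "y": 4},
}
stats = degree_and_strength(toy_edges)
print(stats["A"], stats["B"])  # (3, 3) (2, 9)
```

Here A ranks first by degree, but B ranks first by strength, which is the kind of distinction an unweighted network cannot express.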
Using the node-deletion method, we find that the ranks of overall centrality and of degree of these three nodes are the same.

In Chapter 4, using the methods of Chapter 2, we study the center nodes of the directed and weighted word co-occurrence network of Russian. Comparing the degrees and word frequencies of the nodes, we select the three nodes with the highest degree values as the center nodes. Through centrality analysis, we investigate several network centrality parameters and find that the three center nodes are both local and overall center nodes, but of different importance in the network. Their local centrality rank is Cm < Bm < Am. The ranks of their overall centrality and of their degrees are the same. We then study the importance of the nodes by the node-deletion method and find that the conclusion is consistent with that obtained by the centrality method.

In Chapter 5, using the same method, we study the center nodes of the directed and weighted word co-occurrence network of Arabic. Comparing the degrees and word frequencies of the nodes, we select the three nodes with the highest values of both parameters as the center nodes. Through centrality analysis and node deletion, we find that the three center nodes are local and overall center nodes, and their centrality rank is C阿 < B阿 < A阿. Finally, we find that the conclusion from the node-deletion method is consistent with that obtained by the centrality method.

In Chapter 6, using the same method, we study the center nodes of the directed and weighted word co-occurrence network of French. Comparing the degrees and word frequencies of the nodes, we select the three nodes with the highest degree values as the center nodes. Using centrality analysis to investigate multiple network centrality parameters, we find that the three center nodes are local and overall center nodes, and their local centrality rank is C法 < A法 < B法.
The ranks of A法 and B法 by degree and by repetition differ, and the difference in repetition is the larger one. In this case, the repetition of nodes is crucial for local centrality. Their overall centrality rank is C法 < A法 < B法. The ranks of overall centrality and of degree of these three nodes are different. Finally, we find that the conclusion from the node-deletion method is consistent with that obtained by the centrality method.

In Chapter 7, we use the same method to study the center nodes of the directed and weighted word co-occurrence network of Spanish. Comparing degrees and word frequencies, we select the three nodes with the highest values of both parameters as the center nodes. Through centrality analysis and node deletion, we find that the three center nodes of the network are local and overall center nodes. Their rank by centrality is the same as their rank by degree.

In Chapter 8, we compare the results of the study of the center nodes of the six language networks. From three aspects (node selection, centrality, and node deletion), we examine the common properties and individual characteristics of the center nodes of the six language networks...
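The two families of methods named in the abstract, centrality analysis and node deletion, can be sketched on a toy graph in plain Python. The sketch makes simplifying assumptions: the graph is treated as undirected and unweighted, closeness is computed only over reachable nodes, and the damage caused by deleting a node is measured by the size of the largest connected component. The thesis itself works with directed weighted networks and more indicators; the graph and all function names below are hypothetical.

```python
from collections import deque


def bfs_distances(adj, start):
    """Shortest-path lengths (in edges) from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness(adj, node):
    """Closeness centrality: reachable nodes divided by total distance."""
    dist = bfs_distances(adj, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def largest_component_size(adj):
    """Size of the largest connected component."""
    seen, best = set(), 0
    for node in adj:
        if node not in seen:
            component = bfs_distances(adj, node)
            seen.update(component)
            best = max(best, len(component))
    return best


def delete_node(adj, removed):
    """Return a copy of the graph without the removed node and its edges."""
    return {n: nbrs - {removed} for n, nbrs in adj.items() if n != removed}


# Hypothetical toy graph: "hub" touches a, b, c; d hangs off c.
adj = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"c"},
}
for node in ("hub", "c", "d"):
    damaged = largest_component_size(delete_node(adj, node))
    print(node, round(closeness(adj, node), 3), damaged)
```

In this toy graph the node with the highest closeness ("hub") is also the one whose deletion shrinks the largest component the most, which mirrors the kind of agreement between the two methods that the abstract reports.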
Keywords/Search Tags: Center node, Degree centrality, Strength centrality, Closeness centrality, Betweenness centrality, Eigenvector centrality