Research On Tibetan Web Community Discovery Algorithm Based On Link Analysis

Posted on: 2013-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: F R Chang
Full Text: PDF
GTID: 2248330395470825
Subject: Computer software and theory
Abstract/Summary:
With the development of Internet technology, a growing amount of Tibetan cultural information is being published in the form of Web pages. In recent years the number of Tibetan sites has grown rapidly, and the resulting Tibetan Web data is large in volume and poorly organized. How to find useful information in this data quickly and effectively has become an active research topic. Studies have found that the vast and complex Tibetan Web contains a large number of communities, and that these communities are important for studying "hot-spot" social issues. Communities can provide users with timely and valuable information, and they reflect the complex clustering and hierarchical relationships that are widespread on the Web. An in-depth longitudinal study of Tibetan Web communities can also help us follow cultural development and keep track of social trends in Tibetan areas in a timely manner. Applying Web community discovery algorithms to search engine development can improve the accuracy of Web information retrieval and provide a theoretical foundation for building better search engines.

Link relationships between Web pages provide abundant clues for Web community discovery, and link analysis is one of its key technologies. Building on basic theory such as the definition of a Web community and link analysis techniques, this thesis investigates the data characteristics of current Tibetan websites and their links, and then reviews the two families of link-analysis-based Web community discovery algorithms: link condensing (agglomerative) algorithms and link splitting (divisive) algorithms.
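As an illustration of how link relationships become the input to community discovery, the following minimal sketch collapses page-level hyperlinks into an undirected site-level graph. The function name `build_site_graph`, the example.org URLs, and the choices to treat links as undirected and to drop links within a single site are assumptions made for illustration, not details taken from the thesis.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def build_site_graph(links):
    """Collapse (source_url, target_url) pairs into an undirected site graph."""
    graph = defaultdict(set)
    for source_url, target_url in links:
        src = urlsplit(source_url).hostname
        dst = urlsplit(target_url).hostname
        if src is None or dst is None or src == dst:
            continue  # skip malformed URLs and links inside one site
        graph[src].add(dst)
        graph[dst].add(src)
    # Sorted neighbour lists make the result easy to inspect and compare.
    return {site: sorted(neighbours) for site, neighbours in graph.items()}


# Hypothetical crawl output: three page-level links across three sites.
links = [
    ("http://news.example.org/a.html", "http://culture.example.org/index.html"),
    ("http://news.example.org/b.html", "http://news.example.org/a.html"),
    ("http://culture.example.org/index.html", "http://music.example.org/"),
]
site_graph = build_site_graph(links)
```

A graph in this adjacency form is the kind of structure that both condensing and splitting algorithms operate on.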
This thesis focuses on the community discovery algorithm based on extremal optimization, one of the splitting algorithms, and identifies two problems with it: its time complexity is high when dividing a network with a known number of communities, and it offers no good way to divide a network whose number of communities is unknown. The thesis proposes an improved extremal optimization algorithm based on bifurcation. The algorithm can divide Webs whose communities contain bifurcations, and it can also successfully divide Webs with an unknown number of communities, which widens the range of application of community discovery algorithms. Finally, the thesis presents an experimental scheme to verify the algorithm and applies the algorithm to Tibetan websites. Extensive experiments show that the improved algorithm raises the quality of the discovered Web communities and has both theoretical and practical value.
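The abstract does not spell out the algorithm, so the following is only a generic sketch of extremal-optimization bisection in the style usually attributed to Duch and Arenas, not the bifurcation-based improvement proposed in the thesis. It assumes modularity as the quality measure, a tau-EO selection rule with illustrative parameters `tau` and `max_stall`, and recursive bisection that is accepted only when overall modularity rises, which is one common way to cope with an unknown number of communities. All function names are hypothetical.

```python
import random


def modularity(adj, communities):
    """Newman-Girvan modularity Q of a partition of an undirected graph."""
    two_m = sum(len(nbrs) for nbrs in adj.values())  # twice the edge count
    q = 0.0
    for com in communities:
        # Every internal edge is seen from both endpoints, so it counts twice.
        inner = sum(1 for u in com for v in adj[u] if v in com)
        degree = sum(len(adj[u]) for u in com)
        q += inner / two_m - (degree / two_m) ** 2
    return q


def eo_bisect(adj, nodes, rng, tau=1.5, max_stall=200):
    """Split `nodes` in two with tau-EO on the induced subgraph."""
    nodes = sorted(nodes)
    inside = set(nodes)
    sub = {u: [v for v in adj[u] if v in inside] for u in nodes}
    two_m = sum(len(nbrs) for nbrs in sub.values())
    if two_m == 0:
        return None  # no internal links, nothing to split
    order = nodes[:]
    rng.shuffle(order)
    side = {u: i < len(order) // 2 for i, u in enumerate(order)}  # random halves

    def halves(assign):
        return [{u for u in nodes if assign[u]},
                {u for u in nodes if not assign[u]}]

    best, best_q, stall = dict(side), modularity(sub, halves(side)), 0
    weights = [(rank + 1) ** -tau for rank in range(len(nodes))]
    while stall < max_stall:
        side_degree = {True: 0, False: 0}
        for u in nodes:
            side_degree[side[u]] += len(sub[u])
        fitness = {}
        for u in nodes:
            same = sum(1 for v in sub[u] if side[v] == side[u])
            share = same / len(sub[u]) if sub[u] else 0.0
            fitness[u] = share - side_degree[side[u]] / two_m
        ranked = sorted(nodes, key=fitness.get)  # worst-fitting node first
        moved = rng.choices(ranked, weights=weights)[0]
        side[moved] = not side[moved]  # move the chosen node across
        q = modularity(sub, halves(side))
        if q > best_q + 1e-12:
            best, best_q, stall = dict(side), q, 0
        else:
            stall += 1
    return halves(best)


def discover_communities(adj, seed=0):
    """Bisect recursively, keeping a split only if overall modularity rises."""
    rng = random.Random(seed)
    communities, queue = [set(adj)], [set(adj)]
    while queue:
        com = queue.pop()
        split = eo_bisect(adj, com, rng)
        if split is None or not all(split):
            continue  # no edges, or one side came back empty
        candidate = [c for c in communities if c != com] + split
        if modularity(adj, candidate) > modularity(adj, communities) + 1e-12:
            communities = candidate
            queue.extend(split)
    return communities


# Toy graph: two 4-node cliques joined by a single bridge edge (3, 4).
blocks = ([0, 1, 2, 3], [4, 5, 6, 7])
edges = [(u, v) for b in blocks for i, u in enumerate(b) for v in b[i + 1:]]
edges.append((3, 4))
adj = {n: set() for n in range(8)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)
found = discover_communities(adj, seed=0)
```

Because a split is kept only when modularity increases, the recursion stops on its own and the number of communities does not have to be supplied in advance. The search is stochastic, so different seeds may yield different partitions on larger graphs.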
Keywords/Search Tags: Web community, link analysis, community discovery, bifurcation, extremal optimization