Research Of Blog Information Collecting System Based On Consortium Mining

Posted on: 2010-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: T Zhen
Full Text: PDF
GTID: 2178330332478509
Subject: Computer software and theory
Abstract/Summary:
Blog sites, which publish varied, personalized information and update frequently, are gradually becoming an important source of information. As their intelligence value grows, they are also drawing attention from the relevant authorities. How to effectively target and comprehensively analyze the information released on blogs, and to mine useful information from it rapidly, has become an urgent problem for security departments.

In this thesis, the structure and characteristics of blogs are analyzed in depth, and current technologies and results are studied and validated. A prototype system for monitoring blog sites is designed; it first collects information from pages, then mines the text, and finally analyzes the social network. The system consists mainly of a data collection module, a content filtering module, and a network analysis module. Each module can operate independently or in cooperation with the others.

The main work includes the following. A crawler dedicated to monitoring blog sites was designed and implemented. A blog database was set up, and a blog text classification tool was built by means of database language design. With social networks constructed from the topic and comment relationships in blogs, a method was put forward for highlighting the central figure, the core page, and the core content by finding community structure in blogs. The analytical results are also displayed graphically.

Finally, the performance of the system was verified with real data acquired from the Internet. The tests showed that the system is stable overall: it can collect blog data, classify the text, and build the network rapidly, with accurate analysis results. The system can provide substantial help and guidance for subsequent manual work.
Keywords/Search Tags:Blog, Complex Network, Bayesian Classifier, SNA(Social Network Analysis), Betweenness
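
The abstract mentions building a social network from comment relationships and lists betweenness among its keywords, but it does not say how betweenness is computed. As an illustration only, the following minimal sketch assumes an undirected, unweighted comment graph and uses Brandes' algorithm, a standard technique that is not confirmed to be the thesis's method. The function names `build_comment_graph` and `betweenness`, and the sample users, are hypothetical.

```python
from collections import defaultdict, deque


def build_comment_graph(comments):
    """Build an undirected graph from (commenter, blog_author) pairs.

    Assumption: one edge per pair of users, regardless of comment count.
    """
    graph = defaultdict(set)
    for commenter, author in comments:
        if commenter != author:  # ignore self-comments
            graph[commenter].add(author)
            graph[author].add(commenter)
    return graph


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:  # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical sample data: (commenter, blog_author) pairs.
comments = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("cat", "eve")]
scores = betweenness(build_comment_graph(comments))
central_figure = max(scores, key=scores.get)
```

In this sample, `cat` sits on the most shortest paths between other users, so it is selected as the central figure. A real deployment would likely need directed or weighted edges, which this sketch does not cover.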