Intelligent Microblog Information Generation Strategy Based On Subject Crawler And Text Categorization

Posted on: 2014-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: X W Chen
Full Text: PDF
GTID: 2268330422464771
Subject: Computer technology
Abstract/Summary:
With the rapid development of the mobile Internet, mobile microblogging has become the application with the fastest-growing usage rate. More and more people choose mobile microblogs for interaction because of their immediacy and their role as "We Media". Microblogs provide different services to different user groups, and a variety of professional microblogs have emerged, such as agricultural microblogs and real estate microblogs. Automatically gathering the rich industry information available on the Internet and pushing it by industry topic would make it easier to receive information on mobile phones and would raise the value of professional microblogs. This work therefore has both theoretical and practical value.

In the intelligent microblog information generation mechanism, which is based on a subject crawler and text categorization, we gather the rich industry information online, classify it by topic, and then push it to the target groups. First, we put forward a strategy for gathering topic information in given domains: after the open-source crawler framework Heritrix is extended, the crawler fetches only information related to the given topics. Then, for processing the web data, we improve the text classification algorithm and build a Chinese web text classifier. This classifier automatically classifies web pages by topic and extracts the data used to generate valuable information. The classified information is then presented through the mobile phone microblog under different channels. Finally, we apply this strategy to the Hainan agricultural mobile phone microblog to generate and push agricultural information and to test its effect.

The results show that this method can gather the latest agricultural information and classify it accurately for easy querying. The precision of the subject crawler reaches 87%, and the F1 score of the web text classifier reaches about 85%. The intelligent microblog information generation strategy is therefore effective.
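The abstract does not describe how the extended Heritrix crawler decides that a page is on topic, nor how the reported precision and F1 values were computed. As a rough illustration only, the Python sketch below shows one simple kind of relevance test a topic-focused crawler might apply (keyword overlap against a topic vocabulary) together with the standard precision, recall, and F1 definitions. The vocabulary `AGRICULTURE_TERMS`, the threshold, and all function names are hypothetical and are not taken from the thesis, which builds on Heritrix (a Java framework) and an improved Chinese text classifier.

```python
from typing import Iterable, Set, Tuple

# Hypothetical topic vocabulary; the thesis does not list its keywords.
AGRICULTURE_TERMS: Set[str] = {
    "crop", "harvest", "fertilizer", "irrigation", "livestock",
}


def relevance_score(text: str, topic_terms: Set[str]) -> float:
    """Return the fraction of topic terms that occur in the text."""
    if not topic_terms:
        return 0.0
    # Naive whitespace tokenization; real Chinese text would need a
    # word segmenter, which is outside the scope of this sketch.
    words = set(text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def is_on_topic(text: str, topic_terms: Set[str], threshold: float = 0.2) -> bool:
    """Keep a page only if its relevance score reaches the (assumed) threshold."""
    return relevance_score(text, topic_terms) >= threshold


def precision_recall_f1(
    predicted: Iterable[bool], actual: Iterable[bool]
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for binary on-topic decisions."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # kept and truly on topic
    fp = sum(1 for p, a in pairs if p and not a)    # kept but off topic
    fn = sum(1 for p, a in pairs if not p and a)    # dropped but on topic
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    pages = [
        "Rice harvest and irrigation news",
        "Stock market rallies on earnings",
    ]
    for page in pages:
        print(page, "->", is_on_topic(page, AGRICULTURE_TERMS))
```

Here precision is the share of kept pages that are truly on topic, and F1 is the harmonic mean of precision and recall, so a classifier scores well on F1 only when both are high.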
Keywords/Search Tags: Microblog, Subject Crawler, Text Categorization, Agricultural Microblog