| In recent years, content on social media platforms has grown explosively with the rapid development of Internet technology. People continually publish opinions about specific topics on social media. Over time, a large volume of public opinion data related to a given topic accumulates, providing a data source for studying events under that topic. However, the real-time, large-scale, and nonstandard nature of social media text poses many challenges for topic-specific event analysis. To address these challenges, this thesis designed and implemented an event analysis system for a specific topic based on multi-source social media. The main work of this thesis is as follows: (1) This thesis developed an efficient crawler system for Sina Weibo and Sina News that collects public opinion data in real time according to topic keywords and can also be used to collect data for other social media research. In addition, this thesis trained a TextCNN-based model to filter out public opinion data irrelevant to the topic. (2) This thesis proposed an effective method for entity extraction and alignment and developed BUPTLER, a domestic location identification toolkit that identifies domestic location entities in text very quickly and maps them to standard three-level administrative regions (province, city, and county). (3) This thesis proposed a real-time event detection algorithm for social media based on Single-Pass, which not only detects events on the specific topic efficiently by using entities, hashtags, and forwarding relationships, but also clusters the public opinion posts related to each event for future research. Some events have great influence and long duration, so they merit further study. For such events, this thesis studied the generation of event stages and used a DBSCAN-based sub-event detection algorithm to generate the stages of an event. (4) Based on the above results, this thesis developed an event analysis system for the specific topic based on multi-source social media. A variety of software engineering design patterns and current industry development technologies were applied to ensure that the system runs efficiently. The system provides a web visualization interface so that users can view event detection and public opinion analysis results for the specific topic intuitively and conveniently. |
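
The Single-Pass idea mentioned in item (3) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the `Post` and `Event` classes, the Jaccard similarity measure, and the `threshold` value are all assumptions made for illustration. The sketch only shows the general pattern, in which each incoming post is compared once against the existing events and either joins the most similar one or starts a new one, with a forwarding link taking priority over feature similarity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Post:
    # Hypothetical post record; field names are illustrative only.
    post_id: str
    features: frozenset                   # entities and hashtags found in the post
    forwarded_from: Optional[str] = None  # id of the post this one forwards, if any


@dataclass
class Event:
    # Hypothetical event cluster: accumulated features plus member post ids.
    features: set = field(default_factory=set)
    post_ids: list = field(default_factory=list)


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure, not from the thesis)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def single_pass(posts, threshold=0.3):
    """Assign each post to an event in one pass over the stream."""
    events = []
    post_to_event = {}  # post_id -> index of the event it was assigned to

    for post in posts:
        # A forwarded post follows the event of the post it forwards, if known.
        idx = post_to_event.get(post.forwarded_from)

        if idx is None:
            # Otherwise look for the most similar existing event.
            best_idx, best_sim = None, 0.0
            for i, event in enumerate(events):
                sim = jaccard(post.features, event.features)
                if sim > best_sim:
                    best_idx, best_sim = i, sim

            if best_idx is not None and best_sim >= threshold:
                idx = best_idx
            else:
                # No event is similar enough, so this post starts a new event.
                events.append(Event())
                idx = len(events) - 1

        events[idx].features |= post.features
        events[idx].post_ids.append(post.post_id)
        post_to_event[post.post_id] = idx

    return events
```

Because each post is examined only once and no earlier assignment is revisited, the cost per post depends on the number of existing events rather than on the number of posts seen so far, which is what makes this family of methods suitable for streaming data.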